Abstract

Parallel  computers  have  not  yet  had  the  expected  impact  on  mainstream 
computing.  Parallelism  adds  a  level  of  complexity  to  the  programming  task  that 
makes  it  very  error-prone.  Moreover,  a  large  variety  of  very  different  parallel 
architectures  exists.  Porting  an  implementation  from  one  machine  to  another 
may  require  substantial  changes. 

This  thesis  addresses  some  of  these  problems  by  developing  a  formal  basis  for 
the design of parallel programs in the form of a refinement calculus. The calculus
allows the stepwise formal derivation of an abstract, low-level implementation
from a trusted, high-level specification. The calculus thus helps to structure and
document the development process. Portability is increased, because the
introduction  of  a  machine-dependent  feature  can  be  located  in  the  refinement 
tree.  Development  efforts  above  this  point  in  the  tree  are  independent  of  that 
feature  and  are  thus  reusable.  Moreover,  the  discovery  of  new,  possibly  more 
efficient  solutions  is  facilitated.  Last  but  not  least,  programs  are  correct  by 
construction,  which  obviates  the  need  for  difficult  debugging. 

Our programming/specification notation supports fair parallelism, shared-variable
and message-passing concurrency, local variables and channels. It
allows  the  development  of  reactive  systems,  that  is,  possibly  non-terminating 
programs  designed  to  interact  persistently  with  their  environment.  Moreover, 
the  specification  of  liveness  properties  such  as  termination  or  eventual  entry  is 
supported  by  our  methodology. 

The  calculus  rests  on  a  compositional  trace  semantics  that  treats  shared- 
variable  and  message-passing  concurrency  uniformly.  The  refinement  relation 
combines a context-sensitive notion of trace inclusion and assumption-commitment
reasoning to achieve compositionality. Most refinement rules are syntax-directed
in the sense that each rule corresponds to a specific language construct.
The calculus straddles both concurrency paradigms. A shared-variable program
can be refined into a distributed, message-passing program and vice versa. Moreover,
the framework naturally extends to fine-grained levels of concurrency.

A  large  number  of  examples  illustrate  the  use  of  the  calculus.  A  complete 
derivation of an n-process mutual exclusion algorithm is given and more efficient
versions  are  developed.  The  all-pair,  shortest-paths  graph  problem  is  used  to 
show  the  derivation  of  a  distributed,  message-passing  program  from  a  shared- 
variable  parallel  version. 
Chapter 1

Introduction 


Parallel  computers  consist  of  several  computing  elements  connected  either  by  a 
shared-memory space or by a communication network. They support the execution
of more than one operation at a given time and thus allow the simultaneous
execution of the activities necessary for solving a problem. Computing elements
cooperate either by reading from or writing to the shared-memory space or by
using the communication network to send and receive messages. Programming
parallel computers is significantly more difficult than programming uniprocessor
machines. It presents substantial challenges which still seem to impede a
more widespread use of parallel computers. This thesis addresses some of these
challenges. We suggest structuring the programming process by means of a particular
formal methodology. We intend to provide a solid theoretical foundation
for  more  experimental  future  work. 


1.1  Programming  parallel  computers  is  difficult 

Compared to sequential computers, parallel machines offer substantial performance
advantages at a relatively low cost increase. Moreover, tools for parallel
programming are available and are rapidly becoming more sophisticated:
Graphical user interfaces support program construction by interconnecting processes
diagrammatically [L+92]. Compilers and profilers determine potential for
parallelism and help with the parallelization of existing code [BFG93, Lev93].
Debuggers use graphical ways to track and display the state of any process or
thread [Pan93]. However, despite their appealing performance-to-cost ratio and
the increasing availability of tools, parallel computers still fail to have the expected
impact on mainstream computing. A close look at the state of the art
in parallel computing suggests at least two reasons:

• Parallel programming is inherently complex. Compared to sequential programming,
the programmer must additionally deal with, for instance, interference,
race conditions, process creation and termination, shared resources
and consistency, synchronization and deadlock. Successful treatment
of these issues requires knowledge about, for instance, the location,
interconnection, and relative speeds of processors, and the location of and
access to data. Dijkstra has pointed out the competing forces governing
the representation of an algorithm in the form of a sequential program:

“On  the  one  hand  I  knew  that  programs  could  have  a  compelling 
and  deep  logical  beauty,  on  the  other  hand  I  was  forced  to  admit 
that  most  programs  are  presented  in  a  way  fit  for  mechanical 
execution  but,  even  if  of  any  beauty  at  all,  totally  unfit  for 
human  appreciation.”  [Dij76,  page  xiii] 

This tension is magnified in concurrent programming. In parallel programs,
efficiency in terms of explicit, finer-grained parallelism seems to
exclude  robustness,  maintainability,  and  verifiability. 

• The paradigms and patterns of program execution for various parallel
architectures differ substantially. This lack of commonality makes parallel
programming very architecture-dependent. Consequently, it is hard
to move a program from one architecture to another. Even if the programming
environment seems similar, the underlying communication and
synchronization mechanisms are often very different. Typically, a program
must be substantially modified to take full advantage of, or even
to execute on, a different architecture. Moreover, the development of reliable,
widely applicable performance models is difficult. The change in
performance caused by porting a program may be very hard to predict. In
short, parallel programs typically are not portable. The loss of portability
in turn limits the expected lifetime of parallel implementations and their
economic viability.

In  short,  parallel  programming  to  date  still  is  a  complex,  difficult  endeavour 
that  results  in  efficient,  yet  very  specialized  and  often  short-lived  programs. 

1.2  How  this  thesis  addresses  these  problems 

Traces  have  been  known  as  a  powerful  model  of  concurrency  for  a  long  time, 
e.g., [Abr79, Par79, Pnu85]. In recent work [Bro96b, Bro97], Stephen Brookes
has taken a particular kind of trace, called transition traces, and shown how
they give rise to a fully abstract model for concurrent computation that is
tractable, supports different levels of granularity, and is reasonably
architecture-independent. Transition traces thus make an excellent candidate for a model for
formal parallel program design. The purpose of this thesis is to equip Brookes'
model with a viable software development methodology. In particular, we propose
a refinement calculus that allows the formal, stepwise development of shared-variable
and distributed, message-passing parallel programs from trusted, abstract
specifications.

A  refinement  calculus  consists  of  a  specification  and  programming  notation,  a 
refinement  relation  and  a  set  of  rules  that  govern  this  relation.  For  a  specification 
and programming notation to provide a useful basis for a calculus it should have
certain  properties: 

•  The  notation  should  allow  the  expression  of  all  necessary  aspects  of  the 
development process. Thus, it should leave room for abstract, nondeterministic
specifications, but also for concrete, executable programs. Such
a language has been called a wide-spectrum language [BBB+85].

•  The  notation  should  support  fair  parallel  computation.  A  large  number 
of fairness notions exist [Fra86]. We will concentrate on weak fairness.
More precisely, we assume that every process that is enabled continuously
will eventually be allowed to execute. Weak fairness is a useful minimal
assumption.  On  the  one  hand,  it  can  be  expected  to  be  met  by  any 
reasonable  scheduler.  On  the  other  hand,  it  frees  the  user  from  having  to 
consider  concrete  scheduling  policies  and  implementations  and  thus  helps 
to  avoid  overly  detailed,  operational  reasoning.  Fairness  is  a  good  example 
of  an  abstraction  that  hides  unnecessary  detail  while  preserving  essential 
properties. 

• The notation should support shared-variable and message-passing concurrency.
Both are equally important paradigms. We deem a uniform
treatment  an  important  step  towards  architecture  independence. 

•  The  notation  should  support  local  variables  and  channels.  In  general,  it  is 
good  programming  style  to  limit  the  scope  of  variables  to  their  places  of 
usage.  This  principle  is  especially  true  for  parallel  programs,  because  scope 
and  locality  can  make  reasoning  about  parallel  programs  substantially 
more tractable. Knowing, for instance, that variable x is local means
that other processes can neither read nor change the value of x. Thus, the
environment cannot invalidate program properties involving x, nor can it
be  influenced  by  changes  to  x.  Local  variable  declarations  thus  provide 
another  useful  abstraction  tool.  On  the  one  hand,  they  allow  abstraction 
from  parts  of  the  internal  workings  of  the  body  of  the  declaration.  On 
the  other  hand,  they  allow  abstraction  from  the  possible  influence  of  the 
environment  on  certain  program  properties. 
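
The weak-fairness assumption from the list above can be illustrated with a small sketch. It is not part of the calculus: the round-robin policy, the function names `round_robin` and `max_wait`, and the check on a finite prefix are all assumptions made for illustration. Round-robin is one example of a policy under which a continuously enabled process is chosen within a bounded number of steps, which is stronger than weak fairness requires.

```python
def round_robin(processes, enabled, steps):
    """Return the first `steps` choices of a round-robin scheduler.

    `processes` is a list of process names; `enabled(p, t)` tells whether
    process p is enabled at step t. Disabled processes are skipped.
    """
    schedule = []
    turn = 0
    for t in range(steps):
        for offset in range(len(processes)):
            candidate = processes[(turn + offset) % len(processes)]
            if enabled(candidate, t):
                schedule.append(candidate)
                turn = (turn + offset + 1) % len(processes)
                break
        else:
            schedule.append(None)  # no process enabled: idle step
    return schedule


def max_wait(schedule, process):
    """Longest run of consecutive steps in which `process` was not chosen."""
    longest = current = 0
    for chosen in schedule:
        current = 0 if chosen == process else current + 1
        longest = max(longest, current)
    return longest


# Three hypothetical processes that are enabled at every step.
always = round_robin(["p0", "p1", "p2"], lambda p, t: True, 9)
```

In this run every process waits at most two steps between turns, so none of the continuously enabled processes is starved on the inspected prefix. A finite prefix can only suggest, not prove, fairness of an infinite execution.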

Moreover, the refinement relation itself should also meet certain minimal
requirements.

•  The  relation  should  support  stepwise,  top-down  program  development  and 
compositional  reasoning.  Structured  programming  and  compositionality 
are  important  weapons  against  the  complexity  of  parallel  programming. 
To be most effective, refinement should support sequences of small, manageable
development steps and thus allow the exploration of different design
decisions and alternative implementations. Note that this requires the
refinement  relation  to  be  reflexive  and  transitive.  The  soundness  proof 
of  each  step  should  be  compositional,  that  is,  refinement  between  two 
composite programs should be derivable by showing refinement between
corresponding subprograms. The refinement of a large, complex program
should  be  reducible  to  the  hopefully  easier  task  of  finding  refinements  for 
its  parts. 

• The relation should be context-sensitive. Typically, refinement is carried
out in context. That is, an abstract component C is to be replaced by a
more concrete one C' in a particular context E. Using information about
the specific nature of E makes this replacement substantially more powerful
and may provide information crucial for establishing the soundness
of the refinement step. In other words, C' may refine C only in context
E. In all other contexts, the refinement may fail and replacing C by C'
would be unsound.

• The relation should support the introduction of local variables and channels.
On the specification level, computations are often described in very
abstract terms. During the course of the refinement it may then be necessary
to flesh out the implementation details of these abstract computations.
This very often requires the introduction of local variables, which,
for example, step over an array, or compute temporary results. Changes
to local variables should be unobservable outside their scope.

• The relation should allow a seamless treatment of both concurrency paradigms.
It should be possible to refine a shared-variable program into a
distributed,  message-passing  program  and  vice  versa. 

• The relation should support the specification of and reasoning about liveness
properties. The initial, top-level specification of the program to be
developed  must  be  able  to  express  liveness  properties.  In  the  concurrent 
world,  liveness  properties  are  important.  The  user  must  be  able  to  specify 
and  prove,  for  instance,  that  a  request  will  eventually  be  acknowledged, 
or  that  a  process  will  eventually  be  allowed  to  enter  a  critical  region. 

Finally,  note  that  the  rules  should  also  have  certain  properties. 

• The set of rules should be expressive, that is, they should allow the development
of a substantial class of interesting programs.

• The set of rules should be user-friendly, that is, the rules should not be
overly cumbersome or too numerous.

Issues  not  addressed  in  this  thesis 

Data-reification, sometimes also called data-refinement, allows the formal replacement
of abstract data structures by more concrete and implementable
ones [Hoa72, DH72, dRE99]. Although it is an important part of formal program
design methodologies [Rey81, Jon90, Spi92], the work in this document
will not attempt to incorporate data-reification into the framework. We regard
data-reification as a largely orthogonal program development technique that
should easily mesh with our work presented here on procedural or operation
refinement.

Moreover,  no  attempt  will  be  made  to  address  performance  or  complexity 
issues.  The  semantics  will  concentrate  on  the  sequences  of  states  a  program 
runs  through  during  execution  in  parallel  environments  and  will  thus  be  geared 
towards  correctness.  Consequently,  if  two  programs  exhibit  the  same  traces  they 
will  be  equated  in  the  semantics  regardless  of  their  performance  or  complexity. 
While  we  are  guardedly  optimistic  that  the  methodology  can  be  equipped  with 
cost  measures,  for  instance,  this  aspect  is  left  for  future  research. 

Finally,  implementability  issues  will  not  be  addressed.  As  mentioned  earlier, 
a  model  for  concurrent  computation  should  be  efficiently  implementable  on  a 
variety  of  architectures.  Currently,  we  do  not  know  to  what  extent  our  model 
has  this  property.  In  fact,  this  aspect  is  the  least  developed  in  our  research 
program.  A  lot  of  further  work  is  needed  here. 

1.3  What  makes  a  solution  difficult 

The  above  list  of  requirements  for  our  calculus  is  fairly  large.  Some  of  these 
requirements  are  in  tension  with  each  other  and  a  right  balance  needs  to  be 
found. 

Compositionality, for instance, is hard to achieve in the presence of concurrency
and liveness properties. Assumption-commitment reasoning as proposed
by Jones yields a compositional treatment of concurrency upon which we will
build. Moreover, liveness properties are often very difficult to prove because the
proof requires a global view of the entire system. Since liveness properties are
vital, the challenge is to make their proofs as modular and thus as tractable
as possible. Finally, fairness also needs to be modeled compositionally. Most
treatments of fairness are operational [Fra86, AO83] and a denotational approach
is still widely perceived to be difficult. However, based on early work
by Park [Par79], Brookes and Older have shown that this perception is unjustified
[Bro96b, Old96].

A  suitable  wide-spectrum  language  needs  to  bridge  the  two  extreme  ends 
of  a  spectrum.  On  the  one  hand,  desired  properties  have  to  be  expressed  in 
a  high-level,  abstract  fashion.  On  the  other  hand,  the  low-level,  detailed  view 
of  an  executable  program  needs  to  be  supported.  Moreover,  both  safety  and 
liveness  properties  need  to  be  expressible. 

We  face  a  similar  situation  when  determining  the  rules  of  the  calculus.  A 
large number of rules guarantees expressiveness and applicability of the methodology
in a large variety of settings, but may also lead to confusion and thus
impede  user-friendliness. 

Moreover,  the  wealth  of  paradigms  and  mechanisms  in  parallel  programming 
has  given  rise  to  an  even  more  confusing  wealth  of  different  models  for  these 
paradigms. For instance, shared-variable concurrency has been modeled using
a variety of state traces and temporal logics. Message-passing concurrency on
the other hand has been modeled using, for instance, synchronization trees
(CCS, [Mil89]), or failure-divergence traces (CSP, [BHR84]). The refinement
of a shared-variable program into a distributed, message-passing program (and
vice versa), however, requires a uniform semantic model.

Finally,  the  introduction  of  parallelism,  synchronization  and  communication 
are  crucial  points  in  the  development.  It  is  not  clear  how  a  calculus  can  best 
support  them. 


1.4  Contributions  of  this  thesis 

We  define  a  refinement  calculus  that  satisfies  all  of  the  mentioned  requirements. 

We present a wide-spectrum language that is general enough to allow the
specification of the desired computation in very abstract terms. However, it can
also encode a standard language with fair, shared-variable parallelism, synchronization,
message-passing and local variables and channels.

A  crucial  step  towards  a  refinement  calculus  is  the  definition  of  a  context- 
sensitive  notion  of  approximation.  It  allows  the  comparison  of  the  behaviour 
of  two  programs  with  respect  to  a  particular  environment.  This  capability  is 
extremely useful not only for the formulation of refinement but also for specification
and verification purposes. The definition of context-sensitive approximation
requires a slight but crucial change to the semantics by augmenting traces with
labels. 

A  refinement  calculus  is  presented  that  supports  the  stepwise  development 
of shared-variable and message-passing parallel programs in a context-sensitive
fashion.  The  rules  allow  the  introduction  of  local  variables  and  channels,  and 
the  proof  of  certain  liveness  properties.  Most  of  the  rules  are  compositional. 

A variety of detailed examples illustrates the use of the calculus and demonstrates
its expressiveness and relative ease of use. One of these examples deals
with an n-process mutual exclusion algorithm, contains a derivation from a considerably
more high-level representation, and reveals several alternative implementations,
some of which exhibit more parallelism than the standard textbook
version. 


1.5  Brief  overview  of  related  work 

While  our  research  builds  on  a  very  large  body  of  existing  work,  it  draws  mainly 
from  the  following  three  sources.  A  more  detailed  discussion  of  related  work  can 
be  found  in  Chapter  9. 

The  choice  of  the  underlying  semantics  requires  a  close  look  at  models  for 
concurrent programming. We have chosen Brookes’ transition trace semantics,
which in turn was influenced by Park’s work on modeling fairness [Bro96b,
Par79]. 

Research on compositional proof systems for concurrent programs has successfully
reconciled concurrency and compositionality using assumption-commitment
reasoning [Jon81]. We use a form of assumption-commitment reasoning
that is due to Stirling [Sti88].

Finally,  a  number  of  refinement  calculi  for  sequential  programs  have  been 
proposed  [Mor87,  Mor89,  Heh93].  This  work  provided  intuition  and  clarified 
some  general  questions  about  program  design  through  stepwise  refinement. 


1.6  The  structure  of  this  thesis 

Chapter  2  presents  the  syntax  and  semantics  of  the  language  we  use  to  express 
abstract  specifications  and  executable  programs.  The  semantics  is  given 
in  terms  of  traces.  Relevant  properties  of  the  semantics  are  discussed. 

Chapter 3 reviews Jones’ original formulation of assumption-commitment reasoning
[Jon81] as well as Stirling’s reformulation [Sti88]. Then, Stirling’s
approach  is  adjusted  to  our  setting  and  properties  are  presented. 

Chapter  4  discusses  different  ways  to  compare  the  behaviour  of  two  programs. 
One  leads  to  a  context-insensitive  notion  of  approximation  that  analyzes 
the  behaviour  of  a  program  regardless  of  the  environment  in  which  it  is 
executing.  The  deficiencies  of  this  relation  lead  us  to  a  context-sensitive 
notion  of  approximation.  Finally,  a  way  of  comparing  environments  with 
respect  to  their  capability  for  interference  is  defined. 

Chapter 5 takes assumption-commitment reasoning and context-sensitive approximation
and merges them into our refinement relation. Properties and
rules  are  presented.  The  program  development  methodology  is  given. 

Chapter  6  contains  the  formal  derivations  of  four  shared-variable  programs. 
Section  6.1  contains  a  simple  example  to  illustrate  the  basic  use  of  the 
calculus. Section 6.2 derives a shared-variable parallel implementation of
the Floyd-Warshall algorithm for computing the shortest paths in a graph.
Section 6.3 derives a shared-variable parallel program to find the maximum
in an array of integers. The derived implementation features nested
parallelism. Alternative derivations are discussed. Section 6.4 treats the
generalization of the maximum search problem: The first element in an
array that satisfies a property is to be found. We derive a shared-variable
parallel program and show how further refinement can lead to more efficiency.

Chapter  7  contains  the  formal  derivations  of  two  distributed,  message-passing 
programs. Section 7.1 discusses the prefix sum algorithm (a generalization
of the list ranking algorithm). First, a shared-variable solution is
derived. Then, a distributed, message-passing implementation is obtained
through further refinement. Section 7.2 addresses the all-pair shortest-paths
problem in a graph. Shared-variable and message-passing programs
are derived. We show that not every shared-variable solution gives rise to
an  efficient  distributed  implementation. 
Chapter 8 discusses an n-process mutual exclusion algorithm called the tie-breaker
algorithm or Peterson’s algorithm [Pet81]. A high-level representation
is given and verified. The textbook implementation is derived
together with several other, more parallel implementations.

Chapter 9 discusses related work. We employ a taxonomy that separates sequential
from parallel approaches, compositional proof systems from program
transformation systems and refinement calculi.

Chapter 10 presents future work. We distinguish between immediate improvements
of the framework, and more long-term extensions. A number of
potential  areas  of  application  are  discussed. 

Chapter  11  concludes. 

Appendix  A  contains  the  soundness  proofs  of  the  refinement  rules.  Moreover, 
the  proofs  of  certain  lemmas  needed  in  Chapters  6  and  7  are  collected 
here. 


Chapter  2 

Programs,  contexts  and 
traces 


One  of  the  hallmarks  of  refinement  calculi  is  that  typically  specifications  and 
programs are expressed within the same formalism, and they are neither syntactically
nor semantically distinguished. Programs are specifications that happen
to be executable. Languages that aim at capturing all aspects of the program
development process have also been called wide-spectrum languages or mixed
languages [BBB+85].

In  this  section,  we  present  the  syntax  and  semantics  of  the  language  we  will 
use  to  express  both  high-level,  abstract  specifications  and  low-level,  concrete 
programs. Throughout this document the term “program” will denote an element
of this language and thus either be a non-executable specification or a
standard,  executable  program. 


2.1  Syntax  of  programs  and  contexts 

We  start  by  discussing  program  variables,  and  atomic  and  composite  programs. 

2.1.1  Program  variables 

Let  Var  denote  the  set  of  all  program  variables.  Without  loss  of  generality 
we  assume  Var  to  be  finite.  Typical  examples  for  program  variables  used  in 
this document are x, y, mul, and A[1]. Every variable x has a set $\mathit{Dom}_x$
associated  with  it  that  contains  the  values  that  x  can  take  on. 

2.1.2  Atomic  statements 

Our  notion  of  program  allows  for  very  abstract  descriptions  of  computations. 
The  most  basic  program  component  specifies  a  single  atomic  transition  and  is 
called an atomic statement. Atomic statements are inspired by Carroll Morgan’s
specification  statement  [Mor89]  and  are  of  the  form 

V:[P,Q] 

where V is a finite set of variables, sometimes also called a frame, and P and
Q  are  predicates.  The  atomic  statement  above  describes  atomic  transitions 
from  initial  states  satisfying  P.  More  precisely,  an  initial  state  satisfying  P  is 
transformed  in  one  step  into  some  final  state  satisfying  Q  by  only  changing  the 
variables  in  V.  If  the  initial  state  does  not  satisfy  P,  the  statement  does  not 
offer any transitions.¹ For instance, a random assignment which may set x to
any natural number is described by

\[ \{x\}{:}[\mathit{tt},\ x \geq 0] \]

which we abbreviate by

\[ x{:}[\mathit{tt},\ x \geq 0]. \]

An idling, or stuttering, step is expressed as

\[ \mathbf{skip} = \emptyset{:}[\mathit{tt},\ \mathit{tt}]. \]


To be able to refer to the value a variable held initially, that is, at the beginning
of the transition, we reserve “hooked” variables $\overleftarrow{x}$ in Q. It is easy to see that
atomic statements subsume simple and multiple assignments. The meanings of
the simple assignment $x := x + 1$ and the multiple assignment $x, y := x + 1, 0$, for
example, are captured by

\[ x{:}[\mathit{tt},\ x = \overleftarrow{x} + 1] \]

and

\[ \{x, y\}{:}[\mathit{tt},\ x = \overleftarrow{x} + 1 \land y = 0] \]

respectively.
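
The reading of atomic statements given above can be sketched in executable form. This is only an illustration under stated assumptions: the finite domains, the helper names `all_states` and `transitions`, and the encoding of predicates as Python functions are not from the thesis. The postcondition takes the initial state as its first argument, which plays the role of the hooked variables.

```python
from itertools import product


def all_states(domains):
    """Enumerate every state (a dict variable -> value) over finite domains."""
    names = sorted(domains)
    return [dict(zip(names, values))
            for values in product(*(domains[n] for n in names))]


def transitions(frame, pre, post, domains):
    """Transitions (s, s2) offered by the atomic statement frame:[pre, post].

    pre(s) is a unary predicate on the initial state; post(s, s2) is a binary
    predicate over the initial and the final state. Variables outside the
    frame must keep their value.
    """
    states = all_states(domains)
    result = []
    for s in states:
        if not pre(s):
            continue  # precondition fails: the statement offers no transition
        for s2 in states:
            unchanged = all(s[v] == s2[v] for v in domains if v not in frame)
            if unchanged and post(s, s2):
                result.append((s, s2))
    return result


# Hypothetical tiny domains, chosen only to keep the enumeration small.
DOMAINS = {"x": range(3), "y": range(2)}

# x := x + 1, read as x:[tt, x = (initial x) + 1]
increment = transitions({"x"}, lambda s: True,
                        lambda s, s2: s2["x"] == s["x"] + 1, DOMAINS)

# skip, read as (empty frame):[tt, tt]: every variable keeps its value
skip = transitions(set(), lambda s: True, lambda s, s2: True, DOMAINS)
```

With these domains, `increment` offers a transition only from states where x + 1 is still in the domain, and `skip` relates every state to itself.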

If  a  predicate  does  not  contain  hooked  variables  it  is  called  unary.  Otherwise 
it is called binary. In an atomic statement V:[P, Q], P must be unary, whereas
Q may be unary or binary. Given a set of variables $V \subseteq \mathit{Var}$, the set of all
unary predicates whose free variables are in V is denoted by $\mathit{Preds}(V)$. Thus,
$\mathit{Preds}(\emptyset)$ denotes the set of all closed predicates, that is, predicates without free
variables. For instance, true and false, abbreviated by tt and ff respectively, are
members of $\mathit{Preds}(\emptyset)$, as are 3 > 4 and 1 > 0. Moreover, $\mathit{Preds}(\mathit{Var})$ denotes
the set of all predicates over all variables.

The  semantics  of  atomic  statements  is  conveniently  captured  by  character¬ 
istic  formulas. 

¹ Note that our treatment of this case differs from Morgan’s. In [Mor89], the behaviour of
the  specification  statement  is  completely  unconstrained  if  the  precondition  is  not  met.  See 
Section  9.1.1  for  more  details. 
Definition 2.1 (Atomic statements)

Let $\iota$ be a metavariable that ranges over program variables. Given an atomic
statement V:[P,Q], its characteristic formula $\mathit{cf}_{V:[P,Q]}$ is given by the predicate

\[ \mathit{cf}_{V:[P,Q]} = \overleftarrow{P} \land Q \land \forall \iota \in \mathit{Var} \setminus V.\ \iota = \overleftarrow{\iota} \]

where $\overleftarrow{P}$ abbreviates the substitution of all free variables in P by their hooked
counterparts. We interpret a binary predicate Q over pairs of states (s, s') where s
assigns values to hooked variables and s' to the unhooked ones. More precisely,
$(s, s') \models Q$ iff replacing the hooked variables in Q by their values in s and
replacing the unhooked variables in Q by their values in s' makes Q true. □

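For instance, the statement $x, y := x + 1, 0$ has the characteristic formula

\[ \mathit{cf}_{x,y:=x+1,0} = (x = \overleftarrow{x} + 1) \land (y = 0) \land \forall \iota \in \mathit{Var} \setminus \{x, y\}.\ \iota = \overleftarrow{\iota}. \]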

Characteristic formulas allow us to conveniently determine if a transition (s, s')
conforms with a particular atomic statement V:[P,Q]. More precisely, the execution
of V:[P,Q] from the initial state s could result in the final state s' iff
$(s, s') \models \mathit{cf}_{V:[P,Q]}$. In Section 2.2.2, the semantics of an atomic statement will
be  defined  as  the  set  of  transitions  that  satisfy  its  characteristic  formula. 
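
As a worked illustration, the characteristic formula can be unfolded for two further statements. The sketch assumes the reading of Definition 2.1 in which the formula is the conjunction of the hooked precondition, the postcondition, and a clause that keeps every variable outside the frame unchanged; $\overleftarrow{x}$ is used here as a notation for the hooked variable, and the decrement statement is a hypothetical example, not one taken from the text.

```latex
% skip has an empty frame and trivial pre- and postcondition,
% so only the "unchanged" clause remains, now over all variables:
\mathit{cf}_{\emptyset:[\mathit{tt},\mathit{tt}]}
  \;=\; \mathit{tt} \land \mathit{tt} \land
        \forall \iota \in \mathit{Var} \setminus \emptyset.\ \iota = \overleftarrow{\iota}
  \;=\; \forall \iota \in \mathit{Var}.\ \iota = \overleftarrow{\iota}

% A hypothetical guarded decrement x:[x > 0, x = \overleftarrow{x} - 1]:
% the precondition is hooked because it speaks about the initial state.
\mathit{cf}_{x:[x > 0,\ x = \overleftarrow{x} - 1]}
  \;=\; (\overleftarrow{x} > 0) \land (x = \overleftarrow{x} - 1) \land
        \forall \iota \in \mathit{Var} \setminus \{x\}.\ \iota = \overleftarrow{\iota}
```

Under this reading, a pair (s, s') satisfies the first formula exactly when s and s' agree on every variable, and no pair whose initial state has x = 0 satisfies the second.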

2.1.3 Composite programs

More  complex  programs  can  be  built  from  atomic  statements  using  sequential 
and  parallel  composition,  disjunction,  iteration,  and  hiding.  Let  C  and  D  range 
over programs. An important extension to the standard shared-variable parallel
language involves labels. Syntactically, we allow programs to be enclosed in a
pair of angle brackets $\langle \_ \rangle$. The following grammar generates programs that
contain  zero  or  more  subprograms  enclosed  in  angle  brackets. 

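\[
\begin{array}{rcl}
C & ::= & V{:}[P,Q] \;\mid\; C_1 ; C_2 \;\mid\; C_1 \parallel C_2 \;\mid\; C_1 \lor C_2 \;\mid\; C^{*} \;\mid\; C^{\omega} \;\mid\; \mathbf{new}\ x = e\ \mathbf{in}\ C \;\mid\; \langle D \rangle \\[4pt]
D & ::= & V{:}[P,Q] \;\mid\; D_1 ; D_2 \;\mid\; D_1 \parallel D_2 \;\mid\; D_1 \lor D_2 \;\mid\; D^{*} \;\mid\; D^{\omega} \;\mid\; \mathbf{new}\ x = e\ \mathbf{in}\ D
\end{array}
\]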
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where  e  ranges  over  constants  and  variables.  Note  that  angle  brackets  cannot 
be  nested,  A  program  that  contains  exactly  one  subprogram  enclosed  in  angle 
brackets  is  called  labeled.  A  program  that  contains  no  angle  brackets  is  unlabeled. 
To  motivation  this  extension  of  the  language,  consider  an  unlabeled  parallel 
composition  Ci  ||  C2  and  suppose  we  want  to  refine  Ci.  To  this  end,  it  will  be 
necessary  to  to  distinguish  the  transitions  of  Ci  from  those  of  C2.  Labels  allow 
us  to  achieve  this  distinction.  As  we  will  see  in  Section  2.2,  the  transitions  of  Ci 
in  the  labeled  program  (Ci)  ||  C2  cannot  be  confused  with  those  of  C2  regardless 
of  the  shape  of  Ci  and  C2. 

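Example 2.1 (Well-formed labeled and unlabeled programs)

Both programs below are well-formed. $C_1$ is unlabeled and $C_2$ is labeled.

$$C_1 \;=\; mul:[tt,\ mul = \overleftarrow{mul} + x] \;;\; cnt:[tt,\ cnt = \overleftarrow{cnt} - 1]$$

$$\begin{array}{rcl}
C_2 & = & \{x, y\}:[tt,\ x > 0 \wedge y > 0] \;; \\
 & & \mathbf{new}\ cnt = y\ \mathbf{in} \\
 & & \quad mul:[tt,\ mul = 0] \;; \\
 & & \quad sum:[tt,\ sum = \overleftarrow{x} + y] \parallel \langle (\{cnt > 0\} ; C_1)^{*} ; \{cnt \le 0\} \rangle \\
 & & \mathbf{end}
\end{array}$$

However, neither $C_3$ nor $C_4$ is well-formed.

$$C_3 \;=\; \langle x:[tt,\ x > 0] \rangle \parallel \langle y:[tt,\ y > 0] \rangle$$

$$C_4 \;=\; \langle tmp:[tt,\ tmp = \overleftarrow{x}] \;;\; x:[tt,\ x = \overleftarrow{y}] \;;\; \langle y:[tt,\ y = \overleftarrow{tmp}] \rangle \rangle$$

□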


Definition 2.2 (Abbreviations)

Let $C^{\infty}$ stand for finite and infinite iteration over $C$ and let $C^{+}$ denote finite,
non-trivial iteration over $C$, that is,

$$C^{\infty} \;=\; C^{*} \vee C^{\omega}$$

$$C^{+} \;=\; C ; C^{*}.$$

□


The  set  of  free  variables  in  a  program  is  defined  as  usual. 

Definition  2.3  (Free  variables  of  a  program) 

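Given a program $C$, the set of free variables $fv(C)$ in $C$ is given by:

$$\begin{array}{rcl}
fv(V:[P,Q]) & = & V \cup fv(P) \cup fv(Q) \\
fv(C_1 ; C_2) & = & fv(C_1) \cup fv(C_2) \\
fv(\langle C \rangle) & = & fv(C) \\
fv(C_1 \vee C_2) & = & fv(C_1) \cup fv(C_2) \\
fv(C^{*}) & = & fv(C) \\
fv(C^{\omega}) & = & fv(C) \\
fv(C_1 \parallel C_2) & = & fv(C_1) \cup fv(C_2) \\
fv(\mathbf{new}\ x = e\ \mathbf{in}\ C) & = & fv(C) \setminus \{x\} \cup fv(e)
\end{array}$$

where $fv(P)$ denotes the set of variables occurring free in predicate $P$ and $e$ is
either a constant or a variable. □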
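The structural definition of free variables translates directly into a recursive function. The following Python sketch is an illustration only: programs are encoded as tagged tuples, predicates are represented by the sets of variables occurring free in them, and the tuple layout and the name `fv` are assumptions made for this example.

```python
def fv(prog):
    """Free variables of a program given as a tagged tuple (hypothetical encoding).

    ("atomic", V, fv_P, fv_Q)   atomic statement V:[P,Q], predicates given by
                                their sets of free variables
    ("seq" | "par" | "or", C1, C2)
    ("star" | "omega" | "label", C)
    ("new", x, e, C)            e is a variable name (str) or a constant (int)
    """
    tag = prog[0]
    if tag == "atomic":
        _, frame, fv_pre, fv_post = prog
        return set(frame) | set(fv_pre) | set(fv_post)
    if tag in ("seq", "par", "or"):
        return fv(prog[1]) | fv(prog[2])
    if tag in ("star", "omega", "label"):
        return fv(prog[1])
    if tag == "new":
        _, x, e, body = prog
        fv_e = {e} if isinstance(e, str) else set()
        return (fv(body) - {x}) | fv_e
    raise ValueError(f"unknown program constructor: {tag!r}")


# new cnt = y in (mul:[tt, mul = 0] ; cnt:[tt, cnt = hooked(cnt) - 1])
body = (
    "seq",
    ("atomic", {"mul"}, set(), {"mul"}),
    ("atomic", {"cnt"}, set(), {"cnt"}),
)
declared = ("new", "cnt", "y", body)
free = fv(declared)
```

Here `cnt` is bound by the declaration, while the initializing variable `y` becomes free in the whole program.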

2.1.4  Contexts 

Contexts  play  an  important  role  in  our  work.  They  denote  the  environment 
that  a  program  might  be  executing  in.  Formally,  a  context,  ranged  over  by  E, 
is  an  unlabeled  program  with  exactly  one  hole. 

$$E \;::=\; [\,] \;\mid\; C ; E \;\mid\; E ; C \;\mid\; C \vee E \;\mid\; E \vee C \;\mid\; C \parallel E \;\mid\; E \parallel C \;\mid\; E^{*} \;\mid\; E^{\omega} \;\mid\; \mathbf{new}\ x = e\ \mathbf{in}\ E$$

A context $E$ gives rise to a program $E[C]$, formed by replacing the hole in $E$
by $C$. $E[C]$ can be defined by straightforward structural induction on $E$. Note
that placing a context $E_1$ inside another context $E_2$ yields the context $E_2[E_1]$.
Very often, we will consider a labeled program $\langle C \rangle$ in some context, that is,
$E[\langle C \rangle]$ is the labeled program that is obtained by replacing the hole in $E$
by $\langle C \rangle$.

We call a context $E$ parallel if the hole is in the scope of a parallel composition.
Formally, $E$ is a parallel context if there exist contexts $E_1$ and $E_2$ and
a program $C$ such that $E$ is of the form $E_1[E_2 \parallel C]$. The hole in $E$ is inside $E_2$,
which is under the scope of a parallel composition. A context is sequential if it
is not parallel.

2.2  Semantics  of  programs 

Before the semantics of labeled and unlabeled programs can be given, we need
to introduce labeled transition traces. These traces will be used in Sections 2.2.2
and 2.2.4 to define a sequence of increasingly coarse-grained semantics for the
language we have just presented. Section 2.3 shows how our programming language
can be used to encode the standard programming language constructs.
Section 2.4 sketches how the semantics can be extended to model finer-grained,
and thus more realistic, notions of concurrency like non-atomic assignments and
non-atomic expression evaluation.

2.2.1 Labeled transition traces

Throughout this document, $s, s', s_i \in \Sigma$ denote states, that is, complete mappings
from the finite set of all program variables $Var$ to values. We will use
a particular kind of trace, called transition trace¹, to model programs. Transition
traces have proven very useful for the definition of compositional models
of shared-variable concurrency [Par79, dBKPR91, Bro96b]. One such trace is a
finite or infinite sequence of the form

$$(s_0, s_0')(s_1, s_1') \ldots (s_i, s_i') \ldots$$

and thus represents a possible "interactive" computation of a command in which
state changes made by the command (from $s_i$ to $s_i'$) are interleaved with state
changes made by its environment (from $s_i'$ to $s_{i+1}$). The meaning of a program
is given by a set of transition traces. To describe the meaning of a labeled
program $\langle C_1 \rangle \parallel C_2$ we will consider labeled transition traces of the form

$$(s_0, l_0, s_0')(s_1, l_1, s_1') \ldots (s_i, l_i, s_i') \ldots$$

where each transition carries a label $l$ from the set $\Lambda = \{p, e\}$. A transition
labeled with $p$ was caused by a statement inside the angle brackets, that is,
by $C_1$, and is called a program transition. A transition labeled with $e$ is due
to $C_2$ and is called an environment transition. A trace consisting of program
transitions only is called a program trace. Environment traces are defined analogously.
By describing a labeled program by means of labeled transition traces we thus
regard it as an open system while singling out the transitions made by a specific
part of the program. In other words, $\langle C_1 \rangle \parallel C_2$ can be thought of as an open
system whose environment is known to at least comprise $C_2$.

We now define a few operations on traces and sets of traces. Let $T$, $T_1$, and
$T_2$ range over sets of labeled transition traces. The concatenation operation
$T_1 ; T_2$ and the infinite iteration operation $T^{\omega}$ are defined as

$$T_1 ; T_2 \;=\; \{\alpha\beta \mid \alpha \in T_1 \wedge \beta \in T_2\}$$

$$T^{\omega} \;=\; \{\alpha_0 \ldots \alpha_n \ldots \mid \forall i \ge 0.\ \alpha_i \in T\}.$$

$T^{*}$ denotes the smallest set containing $T$ and the empty trace, closed under
concatenation, that is,

$$T^{*} \;=\; \bigcup_{n \ge 0} T^{n} \quad\text{where}\quad T^{0} = \{\varepsilon\},\quad T^{n+1} = T ; T^{n}.$$

Moreover, $T^{+}$ is like $T^{*}$ except that it does not contain the empty trace, that
is,

$$T^{+} \;=\; T ; T^{*}.$$

$T^{\infty}$ denotes $T^{*} \cup T^{\omega}$. Fair parallel composition is modeled by fair interleaving
of sets of traces

$$T_1 \parallel T_2 \;=\; \bigcup \{\alpha_1 \parallel \alpha_2 \mid \alpha_1 \in T_1 \wedge \alpha_2 \in T_2\}$$

where $\alpha \parallel \beta$ is the set of all traces built by fairly interleaving $\alpha$ and $\beta$. One way
to define $\alpha \parallel \beta$ formally can be found in [Bro96b].

¹ Sometimes also called potential or partial computations or extended sequences.


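$$\begin{array}{rcl}
\alpha \parallel \beta & = & \{\gamma \mid (\alpha, \beta, \gamma) \in fairmerge\} \\
fairmerge & = & (L^{*} R R^{*} L)^{\omega} \cup (L \cup R)^{*} A \\
L & = & \{((s,l,s'), \varepsilon, (s,l,s')) \mid (s,l,s') \in (\Sigma \times \Lambda \times \Sigma)\} \\
R & = & \{(\varepsilon, (s,l,s'), (s,l,s')) \mid (s,l,s') \in (\Sigma \times \Lambda \times \Sigma)\} \\
A & = & \{(\alpha, \varepsilon, \alpha) \mid \alpha \in (\Sigma \times \Lambda \times \Sigma)^{\infty}\} \cup \{(\varepsilon, \beta, \beta) \mid \beta \in (\Sigma \times \Lambda \times \Sigma)^{\infty}\}
\end{array}$$

where concatenation and iteration are extended to sets and triples of traces in
the obvious way: $AB = \{\alpha\beta \mid \alpha \in A \wedge \beta \in B\}$ and
$(\alpha_1, \alpha_2, \alpha_3)(\beta_1, \beta_2, \beta_3) = (\alpha_1\beta_1, \alpha_2\beta_2, \alpha_3\beta_3)$.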
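For two finite traces, fairness imposes no restriction, and the fair merge reduces to the set of all order-preserving interleavings. The following Python sketch illustrates this finite case only; infinite traces, where fairness actually matters, are out of its scope. Traces are modelled as tuples of `(s, l, s')` triples with integer states, and the function names are hypothetical.

```python
def interleavings(a, b):
    """All order-preserving interleavings of two finite traces (tuples of transitions)."""
    if not a:
        return {b}
    if not b:
        return {a}
    starting_with_a = {(a[0],) + rest for rest in interleavings(a[1:], b)}
    starting_with_b = {(b[0],) + rest for rest in interleavings(a, b[1:])}
    return starting_with_a | starting_with_b


def parallel(traces_1, traces_2):
    """Interleaving of two sets of finite traces, lifted pointwise from single traces."""
    result = set()
    for a in traces_1:
        for b in traces_2:
            result |= interleavings(a, b)
    return result


# One program trace of length two and one environment trace of length one.
alpha = ((0, "p", 1), (1, "p", 2))
beta = ((5, "e", 6),)
merged = interleavings(alpha, beta)
```

The three resulting traces differ only in the position at which the environment transition is placed; the relative order of the two program transitions is preserved.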

Let $v$ range over values (constants) over some domain. We write $[s \mid x = v]$
to denote the state that is like $s$ except that the value of $x$ is updated to $v$.
Let $\alpha = (s_0, l_0, s_0')(s_1, l_1, s_1') \ldots (s_i, l_i, s_i') \ldots$ be a transition trace. The trace
$\langle x = v \rangle \alpha$ is like $\alpha$ except that $x$ is initialized to $v$ in the first state and that
the value of $x$ is retained across points of possible interference. More precisely,
$\langle x = v \rangle \alpha$ is

$$([s_0 \mid x = v], l_0, s_0')\,([s_1 \mid x = s_0'(x)], l_1, s_1') \ldots ([s_i \mid x = s_{i-1}'(x)], l_i, s_i') \ldots$$

The trace $\alpha \backslash x$ on the other hand describes a computation like $\alpha$ except that
it never changes the value of $x$. That is, $\alpha \backslash x$ is

$$(s_0, l_0, [s_0' \mid x = s_0(x)])\,(s_1, l_1, [s_1' \mid x = s_1(x)]) \ldots (s_i, l_i, [s_i' \mid x = s_i(x)]) \ldots$$
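Both trace operations can be illustrated on finite traces. The Python sketch below is a minimal rendering under stated assumptions: states are dictionaries, traces are lists of `(s, l, s')` triples, and the function names `init_and_protect` and `freeze` are hypothetical.

```python
def init_and_protect(x, v, trace):
    """The trace that initializes x to v and shields x from interference:
    every initial state takes its value of x from the preceding final state."""
    result = []
    current = v
    for s, label, s_prime in trace:
        result.append(({**s, x: current}, label, s_prime))
        current = s_prime[x]
    return result


def freeze(x, trace):
    """The trace in which no transition changes x: each final state
    takes its value of x from the initial state of the same transition."""
    return [(s, label, {**s_prime, x: s[x]}) for s, label, s_prime in trace]


alpha = [
    ({"x": 5, "y": 1}, "p", {"x": 6, "y": 1}),
    ({"x": 9, "y": 1}, "p", {"x": 10, "y": 2}),
]
protected = init_and_protect("x", 0, alpha)
frozen = freeze("x", alpha)
```

In `protected`, the jump of `x` from 6 to 9 between the two transitions, which models interference by the environment, is removed. In `frozen`, the change of `y` in the second transition survives, while `x` stays constant within each transition.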


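2.2.2 The fine-grained semantics $\mathcal{T}$

We are now ready to present the first semantics.

Definition 2.4 (Semantic map $\mathcal{T}$)

1. Let $\mathcal{P}(T)$ denote the set of all subsets of $T$. The semantic function $\mathcal{T}$
maps the set of labeled and unlabeled programs to $\mathcal{P}((\Sigma \times \Lambda \times \Sigma)^{\infty})$ and
is defined as $\mathcal{T}_e[\![\_]\!]$ where $\mathcal{T}_l[\![\_]\!]$ for $l \in \Lambda$ is given by

$$\begin{array}{rcl}
\mathcal{T}_l[\![V:[P,Q]]\!] & = & \{(s, l, s') \mid (s, s') \models cf_{V:[P,Q]}\} \\
\mathcal{T}_e[\![\langle C \rangle]\!] & = & \mathcal{T}_p[\![C]\!] \\
\mathcal{T}_l[\![C_1 ; C_2]\!] & = & \mathcal{T}_l[\![C_1]\!] ; \mathcal{T}_l[\![C_2]\!] \\
\mathcal{T}_l[\![C_1 \vee C_2]\!] & = & \mathcal{T}_l[\![C_1]\!] \cup \mathcal{T}_l[\![C_2]\!] \\
\mathcal{T}_l[\![C_1 \parallel C_2]\!] & = & \mathcal{T}_l[\![C_1]\!] \parallel \mathcal{T}_l[\![C_2]\!] \\
\mathcal{T}_l[\![C^{*}]\!] & = & (\mathcal{T}_l[\![C]\!])^{*} \\
\mathcal{T}_l[\![C^{\omega}]\!] & = & (\mathcal{T}_l[\![C]\!])^{\omega} \\
\mathcal{T}_l[\![\mathbf{new}\ x = e\ \mathbf{in}\ C]\!] & = & \{\alpha \backslash x \mid first(\alpha) \models e = v \wedge \langle x = v \rangle \alpha \in \mathcal{T}_l[\![C]\!]\}
\end{array}$$

where $first(\alpha)$ denotes the first state of $\alpha$.

2. A trace $(s_0, l_0, s_0')(s_1, l_1, s_1') \ldots$ is interference-free if we have $s_i' = s_{i+1}$ for
all $i \ge 0$. The executions $\mathcal{E}[\![C]\!]$ of a program $C$ are its interference-free
transition traces. Let $C$ be a labeled or unlabeled program. Then,

$$\mathcal{E}[\![C]\!] \;=\; \{\alpha \in \mathcal{T}[\![C]\!] \mid \alpha \text{ is interference-free}\}.$$

3. The execution corresponding to a program trace

$$(s_0, p, s_0')(s_1, p, s_1') \ldots$$

is

$$(s_0, p, s_0')\,\alpha_0\,(s_1, p, s_1')\,\alpha_1 \ldots$$

where $\alpha_i = \varepsilon$ if $s_i' = s_{i+1}$ and $\alpha_i = (s_i', e, s_{i+1})$ otherwise for all $i \ge 0$.

4. $C_1 =_{\mathcal{T}} C_2$ and $C_1 =_{\mathcal{E}} C_2$ abbreviate $\mathcal{T}[\![C_1]\!] = \mathcal{T}[\![C_2]\!]$ and $\mathcal{E}[\![C_1]\!] = \mathcal{E}[\![C_2]\!]$
respectively.

□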

This definition is inspired by Brookes' transition trace semantics [Bro96b].
Brookes, however, does not use labels. The semantic mapping handles labels
by using a subscript that indicates whether the argument is inside a label or
not. The subscript $e$ indicates that the argument is not inside a label. The
subscript $p$ indicates that the argument is inside a label. Thus, the denotation
of the overall program is computed using $\mathcal{T}_e$. If during the computation of the
semantics a labeled subprogram is encountered, the subscript changes from $e$ to
$p$. Note that the subscript never changes in the opposite direction, that is, from
$p$ to $e$. Also, note that $\mathcal{T}_p[\![\langle C \rangle]\!]$ is not defined. Since labels cannot be nested,
this case cannot occur.

The traces of $\mathbf{new}\ x = e\ \mathbf{in}\ C$ do not change the value of $x$ and are obtained
by executing $C$ under the assumption that $x$ is set to the value of $e$ initially and
that the environment cannot change the value of $x$.

Not  every  program  has  a  non-empty  denotation  under  T-  The  program 
for  instance,  has  no  traces  associated  with  it.  Moreover,  not  every 
program  that  has  traces,  has  executions.  The  program 

x:=0  ;  0:[x  =  1,  x  =  1] 

for  instance,  has  traces,  but  no  executions.  The  final  states  of  the  first  assign¬ 
ment  and  the  initial  states  of  the  second  statement  do  not  overlap.  Intuitively, 
the  control  flow  “has  no  place  to  go” .  While  programs  with  no  executions  or 
traces  are  allowed  in  our  specification  language,  we  typically  do  not  use  them. 
Because  they  introduce  the  possibility  of  trivial  refinements,  we  will  single  out 
an  important  subclass  of  programs  that  never  have  an  empty  set  of  traces  or 
executions  in  Section  2.3.10. 
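The difference between traces and executions in the example above can be checked mechanically on a small state space. The Python sketch below assumes a single variable `x` ranging over the two values 0 and 1, so that a state is just the value of `x`; the helper names are hypothetical and only finite traces are handled.

```python
DOMAIN = (0, 1)  # assumption: x is the only variable and ranges over {0, 1}

# Traces of x := 0: one environment-labeled transition from any state to x = 0.
assign_zero = {((s, "e", 0),) for s in DOMAIN}

# Traces of the idling step that requires x = 1 before and after.
require_one = {((1, "e", 1),)}


def concat(traces_1, traces_2):
    """Concatenation of two sets of finite traces."""
    return {a + b for a in traces_1 for b in traces_2}


def interference_free(trace):
    """Each final state must equal the initial state of the next transition."""
    return all(trace[i][2] == trace[i + 1][0] for i in range(len(trace) - 1))


traces = concat(assign_zero, require_one)
executions = {t for t in traces if interference_free(t)}
```

Both traces of the composite program need the environment to change `x` from 0 to 1 between the two transitions, so none of them is interference-free.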
2.2.3 Closure conditions

The semantic map $\mathcal{T}$ is very fine-grained. For instance, it is sensitive to the
number of transitions a program can take. Thus, $C$ and $C ; \mathbf{skip}$, for example,
are distinguished, as are $x:=1$ and $\mathbf{new}\ y = 0\ \mathbf{in}\ y:=1 ; x:=y$. We will now
introduce two closure conditions, which make $\mathcal{T}$ more coarse-grained. These
closure conditions are inspired by the stuttering and mumbling closure conditions
proposed by Brookes to achieve full abstraction [Bro96b]. In his setting,
the closure conditions correspond, respectively, to reflexivity and transitivity of
the $\rightarrow^{*}$ relation in a conventional operational semantics. The addition of labels,
however, forces us to make slight adjustments to Brookes' definitions. The intuition
behind these adjustments is to keep program and environment transitions
distinct. For instance, an unlabeled program $C$ will have environment
transitions only and no program transitions. Also, every trace of $C_1 ; \langle C_2 \rangle ; C_3$,
for example, will contain at least one program transition.

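Definition 2.5 (Closure conditions)

Let $T$ be a set of traces.

• The s-closure of $T$, denoted by $T^{\dagger}$, is the smallest set which contains $T$ and
satisfies:

  - Stuttering:

    1. Finite case: If $\alpha (s, l, s') \beta \in T^{\dagger}$ then $\alpha (s'', l, s'')(s, l, s') \beta \in T^{\dagger}$
    and $\alpha (s, l, s')(s'', l, s'') \beta \in T^{\dagger}$ for all $s''$, and

    2. Infinite case: if

    $$\alpha_0 (s_0, l_0, s_0') \alpha_1 (s_1, l_1, s_1') \alpha_2 \ldots (s_i, l_i, s_i') \ldots \in T^{\dagger}$$

    then

    $$\alpha_0 \beta_0 \alpha_1 \beta_1 \ldots \alpha_i \beta_i \ldots \in T^{\dagger}$$

    where for all $i \ge 0$,

    $$\beta_i = (s_i, l_i, s_i')(s_i'', l_i, s_i'') \quad\text{or}\quad \beta_i = (s_i'', l_i, s_i'')(s_i, l_i, s_i')$$

    for some $s_i''$.

• The sm-closure of $T$, denoted by $T^{\ddagger}$, is the smallest set which contains $T$
and satisfies:

  - Stuttering: as before.

  - Mumbling:

    1. Finite case: If $\alpha (s, l, s')(s', l, s'') \beta \in T^{\ddagger}$ then $\alpha (s, l, s'') \beta \in T^{\ddagger}$,
    and

    2. Infinite case: if

    $$\alpha_0 (s_0, l_0, s_0')(s_0', l_0, s_0'') \alpha_1 (s_1, l_1, s_1')(s_1', l_1, s_1'') \alpha_2 \ldots \in T^{\ddagger}$$

    then

    $$\alpha_0 (s_0, l_0, s_0'') \alpha_1 (s_1, l_1, s_1'') \alpha_2 \ldots \in T^{\ddagger}.$$

□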

The stuttering condition makes the semantics insensitive to finite amounts of
stuttering. A stuttering step with label $l$ can only be inserted in the neighbourhood
of an already existing transition with label $l$. Note that we can stutter
at infinitely many places in an infinite trace. However, stuttering cannot turn
a finite trace into an infinite trace. If two adjacent transitions share the intermediate
state and have the same label, the mumbling condition allows the
absorption of that state. Thus, mumbling across label boundaries is not permitted.
Note that we can mumble at infinitely many places in an infinite trace.
However, mumbling cannot turn an infinite trace into a finite trace.
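A single application of the finite stuttering and mumbling rules can be sketched as follows. The Python code is an illustration under stated assumptions: traces are finite tuples of `(s, l, s')` triples, states are plain values, the stuttering step may use any state of a given finite state set, and the names `stutter_once` and `mumble_once` are hypothetical. Computing the full closures would require iterating these steps, which the sketch does not attempt.

```python
def stutter_once(trace, states):
    """All traces obtained by inserting one stuttering step (q, l, q) directly
    before or after an existing transition that carries the same label l."""
    results = set()
    for i, (_, label, _) in enumerate(trace):
        for q in states:
            step = (q, label, q)
            results.add(trace[:i] + (step,) + trace[i:])          # insert before
            results.add(trace[:i + 1] + (step,) + trace[i + 1:])  # insert after
    return results


def mumble_once(trace):
    """All traces obtained by absorbing one intermediate state shared by two
    adjacent transitions that carry the same label."""
    results = set()
    for i in range(len(trace) - 1):
        (s, label, mid), (mid_next, label_next, s_final) = trace[i], trace[i + 1]
        if label == label_next and mid == mid_next:
            results.add(trace[:i] + ((s, label, s_final),) + trace[i + 2:])
    return results


same_label = ((0, "p", 1), (1, "p", 2))
mixed_labels = ((0, "p", 1), (1, "e", 2))
mumbled = mumble_once(same_label)
blocked = mumble_once(mixed_labels)
stuttered = stutter_once(((0, "p", 1),), states=(0, 1))
```

`blocked` is empty because the two transitions carry different labels, which reflects that mumbling across label boundaries is not permitted. Stuttering the empty trace yields nothing, since there is no existing transition whose label could be reused.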

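The fine-grained semantics was given in terms of four operations on sets of
traces. We define closed variants of these operations. The s-closed concatenation
$T_1 ;^{\dagger} T_2$ and the sm-closed concatenation $T_1 ;^{\ddagger} T_2$ are defined as

$$T_1 ;^{\dagger} T_2 \;=\; \{\alpha\beta \mid \alpha \in T_1 \wedge \beta \in T_2\}^{\dagger} \;=\; (T_1 ; T_2)^{\dagger}$$

$$T_1 ;^{\ddagger} T_2 \;=\; \{\alpha\beta \mid \alpha \in T_1 \wedge \beta \in T_2\}^{\ddagger} \;=\; (T_1 ; T_2)^{\ddagger}.$$

The closed infinite iteration operations $T^{\omega\dagger}$ and $T^{\omega\ddagger}$ are given by

$$T^{\omega\dagger} \;=\; \{\alpha_0 \ldots \alpha_n \ldots \mid \forall i \ge 0.\ \alpha_i \in T\}^{\dagger} \;=\; (T^{\omega})^{\dagger}$$

$$T^{\omega\ddagger} \;=\; \{\alpha_0 \ldots \alpha_n \ldots \mid \forall i \ge 0.\ \alpha_i \in T\}^{\ddagger} \;=\; (T^{\omega})^{\ddagger}.$$

The sets $T^{*\dagger}$ and $T^{*\ddagger}$ denote the smallest sets containing $T$ and the empty
trace, closed under s-closed concatenation and sm-closed concatenation respectively.
The s-closed parallel composition $T_1 \parallel^{\dagger} T_2$ and the sm-closed parallel
composition $T_1 \parallel^{\ddagger} T_2$ are defined as

$$T_1 \parallel^{\dagger} T_2 \;=\; \bigcup \{\alpha_1 \parallel \alpha_2 \mid \alpha_1 \in T_1 \wedge \alpha_2 \in T_2\}^{\dagger} \;=\; (T_1 \parallel T_2)^{\dagger}$$

$$T_1 \parallel^{\ddagger} T_2 \;=\; \bigcup \{\alpha_1 \parallel \alpha_2 \mid \alpha_1 \in T_1 \wedge \alpha_2 \in T_2\}^{\ddagger} \;=\; (T_1 \parallel T_2)^{\ddagger}$$

where the fair merge of two traces $\alpha \parallel \beta$ is defined as before in Section 2.2.1.

2.2.4 Two more coarse-grained semantics $\mathcal{T}^{\dagger}$ and $\mathcal{T}^{\ddagger}$

We use the closure conditions to define two semantics that are more coarse-grained
and thus more suitable for our purposes than $\mathcal{T}$. Both definitions differ
from the definition of $\mathcal{T}$ only in their use of the closure conditions.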

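Definition 2.6 (Semantic maps $\mathcal{T}^{\dagger}$ and $\mathcal{T}^{\ddagger}$)

1. Let $\mathcal{P}^{\dagger}(T)$ denote the set of all subsets of $T$ that are closed under stuttering.
The semantic function $\mathcal{T}^{\dagger}$ maps the set of labeled and unlabeled
programs $C$ to $\mathcal{P}^{\dagger}((\Sigma \times \Lambda \times \Sigma)^{\infty})$ and is defined as $\mathcal{T}^{\dagger}_e[\![\_]\!]$ where $\mathcal{T}^{\dagger}_l[\![\_]\!]$ for
$l \in \Lambda$ is given by

$$\begin{array}{rcl}
\mathcal{T}^{\dagger}_l[\![V:[P,Q]]\!] & = & \{(s, l, s') \mid (s, s') \models cf_{V:[P,Q]}\}^{\dagger} \\
\mathcal{T}^{\dagger}_e[\![\langle C \rangle]\!] & = & \mathcal{T}^{\dagger}_p[\![C]\!] \\
\mathcal{T}^{\dagger}_l[\![C_1 ; C_2]\!] & = & \mathcal{T}^{\dagger}_l[\![C_1]\!] ;^{\dagger} \mathcal{T}^{\dagger}_l[\![C_2]\!] \\
\mathcal{T}^{\dagger}_l[\![C_1 \vee C_2]\!] & = & \mathcal{T}^{\dagger}_l[\![C_1]\!] \cup \mathcal{T}^{\dagger}_l[\![C_2]\!] \\
\mathcal{T}^{\dagger}_l[\![C_1 \parallel C_2]\!] & = & \mathcal{T}^{\dagger}_l[\![C_1]\!] \parallel^{\dagger} \mathcal{T}^{\dagger}_l[\![C_2]\!] \\
\mathcal{T}^{\dagger}_l[\![C^{*}]\!] & = & (\mathcal{T}^{\dagger}_l[\![C]\!])^{*\dagger} \\
\mathcal{T}^{\dagger}_l[\![C^{\omega}]\!] & = & (\mathcal{T}^{\dagger}_l[\![C]\!])^{\omega\dagger} \\
\mathcal{T}^{\dagger}_l[\![\mathbf{new}\ x = e\ \mathbf{in}\ C]\!] & = & \{\alpha \backslash x \mid first(\alpha) \models e = v \wedge \langle x = v \rangle \alpha \in \mathcal{T}^{\dagger}_l[\![C]\!]\}
\end{array}$$

2. Let $\mathcal{P}^{\ddagger}(T)$ denote the set of all subsets of $T$ that are closed under stuttering
and mumbling. The semantic function $\mathcal{T}^{\ddagger}$ maps the set of labeled and
unlabeled programs $C$ to $\mathcal{P}^{\ddagger}((\Sigma \times \Lambda \times \Sigma)^{\infty})$. It is defined just like $\mathcal{T}^{\dagger}$
except that every occurrence of $\dagger$ is replaced by $\ddagger$.

3. The semantic maps $\mathcal{E}^{\dagger}$ and $\mathcal{E}^{\ddagger}$ return the executions of closed sets of traces.
That is,

$$\mathcal{E}^{\dagger}[\![C]\!] \;=\; \{\alpha \in \mathcal{T}^{\dagger}[\![C]\!] \mid \alpha \text{ interference-free}\}$$

and

$$\mathcal{E}^{\ddagger}[\![C]\!] \;=\; \{\alpha \in \mathcal{T}^{\ddagger}[\![C]\!] \mid \alpha \text{ interference-free}\}.$$

$C_1 =_{\mathcal{T}^{\dagger}} C_2$ and $C_1 \subseteq_{\mathcal{T}^{\dagger}} C_2$ abbreviate $\mathcal{T}^{\dagger}[\![C_1]\!] = \mathcal{T}^{\dagger}[\![C_2]\!]$ and $\mathcal{T}^{\dagger}[\![C_1]\!] \subseteq \mathcal{T}^{\dagger}[\![C_2]\!]$
respectively. Similarly for $\mathcal{T}^{\ddagger}$, $\mathcal{E}^{\dagger}$, and $\mathcal{E}^{\ddagger}$. □

Note that the denotation of a program $C$ under $\mathcal{T}^{\dagger}$ is equivalent to the denotation
of $C$ under $\mathcal{T}$ closed up under stuttering. An analogous property holds for
the denotation of $C$ under $\mathcal{T}^{\ddagger}$. Formally, we have

$$\mathcal{T}^{\dagger}[\![C]\!] \;=\; (\mathcal{T}[\![C]\!])^{\dagger}$$

$$\mathcal{T}^{\ddagger}[\![C]\!] \;=\; (\mathcal{T}[\![C]\!])^{\ddagger}.$$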

Also  note  that  under  the  closure  conditions  an  unlabeled  program  C  continues 
to  have  environment  transitions  only,  while  a  labeled  program  like  (C)  still  has 
only  program  transitions. 

Properties  of  trace  equivalence 

The  following  lemma  lists  a  few  properties  of  trace  equivalence.  The  list  is 
incomplete,  but  suffices  for  the  purposes  of  this  thesis. 
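Lemma 2.1 (Properties of trace equivalence)

1. Trace equivalence under $\mathcal{T}$ implies trace equivalence under $\mathcal{T}^{\dagger}$ and $\mathcal{T}^{\ddagger}$,
that is, $C_1 =_{\mathcal{T}} C_2$ implies $C_1 =_{\mathcal{T}^{\dagger}} C_2$ and $C_1 =_{\mathcal{T}^{\ddagger}} C_2$. Moreover, trace
equivalence under $\mathcal{T}^{\dagger}$ implies trace equivalence under $\mathcal{T}^{\ddagger}$, that is, $C_1 =_{\mathcal{T}^{\dagger}} C_2$
implies $C_1 =_{\mathcal{T}^{\ddagger}} C_2$.

2. Parallel composition is associative and commutative, that is,

$$[C_1 \parallel C_2] \parallel C_3 \;=_{\mathcal{T}}\; C_1 \parallel [C_2 \parallel C_3]$$

$$C_1 \parallel C_2 \;=_{\mathcal{T}}\; C_2 \parallel C_1$$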

3.  Both  sequential  and  parallel  composition  are  invariant  under  the  addition 

of  finite  stuttering,  that  is,  if  D  Cq-t  0  :  then 

C;D*  D*-,C  C  ||  D*  C. 

4. Nesting of Kleene-star operations does not add behaviour, that is,

$$C^{*} \;=_{\mathcal{T}}\; (C^{*})^{*}$$

5. Parallel composition distributes over disjunction, that is,

$$[C_1 \vee C_2] \parallel C_3 \;=_{\mathcal{T}}\; [C_1 \parallel C_3] \vee [C_2 \parallel C_3].$$

6. Adding a declaration for a variable that does not occur free does not
change the behaviour. Formally, if $x \notin fv(C)$ then

$$C \;=_{\mathcal{T}}\; \mathbf{new}\ x = e\ \mathbf{in}\ C.$$

7.  (Increasing  parallelism)  A  multiple  assignment  involving  two  variables 
that  do  not  occur  free  in  the  parallel  context  can  be  replaced  by  two 
parallel  simple  assignments,  if  one  of  the  variables  is  local.  Let  £*  be  a 
sequential  context.  If  neither  xi  nor  X2  nor  any  of  the  variables  in  ei  or 
62  are  free  in  C,  then 

new  xi  =  e  in  [£'[xi, X2:=ei, 62]  ||  C] 

=q-t  new  xi  =  e  in  [E[xi  :-ei  ||  X2  :=e2]  ||  C] . 

8. Being able to choose between two identical alternatives is like having no
choice at all.

$$C \;=_{\mathcal{T}}\; C \vee C$$

9. Trace equivalence is a congruence, that is,

$$C_1 =_{\mathcal{T}} C_2$$

implies

$$E[C_1] =_{\mathcal{T}} E[C_2]$$

for all contexts $E$.

Proof: See Section A.1.1 in the appendix. ■

Note how an equivalence involving $=_{\mathcal{T}}$ in the above lemma can be weakened
to an equivalence involving $=_{\mathcal{T}^{\dagger}}$ or $=_{\mathcal{T}^{\ddagger}}$ using the fact that trace equivalence
under $\mathcal{T}$ implies trace equivalence under $\mathcal{T}^{\dagger}$ and $\mathcal{T}^{\ddagger}$.

Properties  of  trace  inclusion 

Trace  inclusion  will  prove  to  be  a  useful  reasoning  tool.  To  be  able  to  deal  with 
declarations  in  a  compositional  way,  we  define  trace  inclusion  modulo  a  set  of 
variables. 

Definition 2.7 (Trace inclusion modulo V)

A trace set $T_1$ is included in another trace set $T_2$ modulo a variable $x$,

$$T_1 \subseteq T_2 \pmod{x}$$

for short, if for every trace $\alpha$ in $T_1$ such that $\langle x = v \rangle \alpha$ for some $v \in Dom_x$
also is in $T_1$, there exists a trace $\beta$ in $T_2$ such that $\langle x = v \rangle \beta$ also is in $T_2$ and
$\alpha \backslash x = \beta \backslash x$.

Given a set of variables $V$, let $T_1 \subseteq T_2 \pmod{V}$ be the obvious generalization.
Let $C_1 \subseteq_{\mathcal{T}^{\ddagger}} C_2 \pmod{V}$ stand for $\mathcal{T}^{\ddagger}[\![C_1]\!] \subseteq \mathcal{T}^{\ddagger}[\![C_2]\!] \pmod{V}$. □

We list a few properties in the next lemma. Again, no attempt at completeness
is made.

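Lemma 2.2 (Properties of trace inclusion)

1. Trace inclusion under $\mathcal{T}$ implies trace inclusion under $\mathcal{T}^{\dagger}$ and $\mathcal{T}^{\ddagger}$, that
is, $C_1 \subseteq_{\mathcal{T}} C_2$ implies $C_1 \subseteq_{\mathcal{T}^{\dagger}} C_2$ and $C_1 \subseteq_{\mathcal{T}^{\ddagger}} C_2$. Similarly, execution
inclusion under $\mathcal{E}$ implies execution inclusion under $\mathcal{E}^{\dagger}$ and $\mathcal{E}^{\ddagger}$, that is,
$C_1 \subseteq_{\mathcal{E}} C_2$ implies $C_1 \subseteq_{\mathcal{E}^{\dagger}} C_2$ and $C_1 \subseteq_{\mathcal{E}^{\ddagger}} C_2$.

2. Trace inclusion implies execution inclusion. If $C_1 \subseteq_{\mathcal{T}} C_2$, then $C_1 \subseteq_{\mathcal{E}} C_2$.

3. The behaviour of a program $C$ is subsumed by the finite iteration $C^{*}$, that
is, $C \subseteq_{\mathcal{T}} C^{*}$.

4. Trace inclusion between parallel components implies trace inclusion between
the entire parallel compositions, that is, if $C_i \supseteq_{\mathcal{T}} C_i'$ for all $1 \le i \le n$,
then

$$\parallel_{i=1}^{n} C_i \;\supseteq_{\mathcal{T}}\; \parallel_{i=1}^{n} C_i'.$$

5. Trace inclusion between two atomic statements coincides with implication
of their characteristic formulas, that is,

$$V_1:[P_1,Q_1] \;\subseteq_{\mathcal{T}}\; V_2:[P_2,Q_2]$$

iff

$$cf_{V_1:[P_1,Q_1]} \Rightarrow cf_{V_2:[P_2,Q_2]}.$$

6. Trace inclusion is a congruence, that is,

$$C_1 \subseteq_{\mathcal{T}} C_2$$

implies

$$E[C_1] \subseteq_{\mathcal{T}} E[C_2].$$

Note that the congruence property implies Property 4 above.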

7.  Trace  inclusion  modulo  a  set  of  variables  V  between  two  programs  Ci  and 
C2  characterizes  trace  inclusion  between  two  programs  that  differ  from  C\ 
and  C2  only  in  that  the  variables  in  V  are  declared  local.  More  precisely, 

Cl  Cr  C2  {mod  {xi, . . . ,  x„}) 


if  and  only  if 

new  xi  =  ei, , . ,  yXn  =  en  in  Ci  Cj  new  Xi  =  ei , . . . , in  C2 

for  all  e,-  with  values  in  Dorrixi ,  the  domain  of  x,-,  and  1  <  i  <  n. 

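8. (Decreasing parallelism) If $C_1$ is finite, the behaviour of $C_1 ; C_2$ is subsumed
by $C_1 \parallel C_2$.

(a) If $C_1$ has only finite traces, then

$$C_1 \parallel C_2 \;\supseteq_{\mathcal{T}}\; C_1 ; C_2.$$

(b) If $C_1$ through $C_{n-1}$ have only finite traces and $i$ is not free in $C_1, \ldots, C_n$,
then

$$\parallel_{i=1}^{n} C_i \;\supseteq_{\mathcal{T}}\; \mathbf{for}\ i = 1\ \mathbf{to}\ n\ \mathbf{do}\ C_i.$$

Proof: See Section A.1.2 in the appendix. ■

Note how an inclusion involving $\subseteq_{\mathcal{T}}$ in the above lemma can be weakened
to an inclusion involving $\subseteq_{\mathcal{T}^{\dagger}}$ or $\subseteq_{\mathcal{T}^{\ddagger}}$ using the fact that trace inclusion under
$\mathcal{T}$ implies trace inclusion under $\mathcal{T}^{\dagger}$ and $\mathcal{T}^{\ddagger}$.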

Robust  programs 

Due to interference on shared variables, the parallel execution of processes often
leads to unexpected results. The behaviour of some programs, however, is unaffected
by parallelism. The program $x:=1 ; x:=1$ is equivalent to the program
$x:=1 \parallel x:=1$. The notion of robustness generalizes this property. Informally, the
semantics of an n-fold parallel composition of a robust program is equivalent to
its n-fold sequential composition. Robustness will play an important role in our
calculus, because it facilitates the introduction of parallelism.
Lemma 2.1

1.  If  Cl  =r  ^^2,  then  Cl  =7-t  C2  and  Ci  =j-t  C2.  If  Ci  ='j'i  C2,  then 
Cl  ~jt  C2. 

2.  [Cl  II  C2  II  C3  -ri  Cl  II  [C2  II  C3]  and  Ci  ||  C2  C2  ||  Ci. 

3.  If  D  Htt.tt],  then  C  ;  D*  =rt  D*  -.C  =rt  C  ||  D*  =rt  C. 

4.  c*  =r  ic*y. 

5.  [Cl  V  C2]  II  C3  -r  [Cl  II  C3]  V  [C2 II  C3]. 

6.  R  X  ^  MC),  then  C  =7-  new  x  =  e  in  C. 

7.  If  £■  is  a  sequential  context  and  neither  xq  nor  X2  nor  any  of  the  variables 
in  ei  or  62  are  free  in  C,  then 

new  a^i  =  e  in  E[xi,  X2:=ei^e2]\\  C] 

—q-t  new  a:i  =  e  in  [E[xi  :=ei  ||  X2  1=62]  ||  C] . 

8.  C^rCyC. 

9.  If  Cl  =r  C2,  then  £;[Ci]  E[C2]  for  all  E. 

Lemma  2.2 

1.  If  Cl  C7-  C2,  then  Cl  Cq-^  C2  and  Ci  Cq-t  C2.  If  Ci  C^:  C2,  then 
Cl  ^£:t  C2  and  Ci  C2. 

2.  If  Cl  Cr  C2,  then  Ci  C2. 

3.  CCr  C*. 

4.  If  Ci  Dr  Cl  for  all  1  <  i  <  n,  then  \\f^^Ci  Dr  ||?=iC;. 

5.  Vi:[Pl,Ql]  Qr  V2:[P2,Q2]  iff  <^M:[P2,Q2]* 

6.  If  Cl  Cr  C2,  then  E[Ci]  Cr  E[C2]  for  all  E. 

7.  Cl  Crt  C2  {mod  {xi,.,.,  o^n})  iff 

new  xi  =  ei^ . . .  ,Xn  =  Cn  in  Ci  Cr  new  Xi  —  ei, . . .  ^Xn  =  Cn  in  C2 
for  all  €{  with  values  in  Dom^.  and  1  <  «  <  n. 

8.  (a)  If  Cl  has  only  finite  traces,  then  Ci  ||  C2  Ci  ;  C2. 

(b)  If  Cl  through  Cn-i  have  only  finite  traces,  then 

ll?=iC2  Dr  for  i  =  I  to  n  do  Cr 

Figure  2.1:  Properties  of  trace  equivalence  and  inclusion 
Definition 2.8 (Robust programs)
A program $C$ is called robust if


C*  Drx  C"  =rt  irc 


for all $n \ge 1$ where $C^{n}$ and $\parallel^{n} C$ denote the n-fold sequential composition and
the n-fold parallel composition respectively, that is, $C^{1} = C$ and $C^{n+1} = C ; C^{n}$,
and $\parallel^{1} C = C$ and $\parallel^{n+1} C = C \parallel [\parallel^{n} C]$. □

Besides $x:=1$, the programs $x:=x+1$ and $x:=x+1 ; x:=x+1$ are also robust.
The program $x:=1 ; x:=x+1$, however, is not. Robustness depends on the level
of granularity. Atomic statements and finite loops over them are always robust.
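The contrast between the two example programs can be observed on a small finite model. The Python sketch below is a deliberate simplification and not the definition used in this thesis: it works with the unclosed trace sets, omits labels, lets `x` range over the integers modulo 4, and compares only the two-fold sequential and parallel compositions. All helper names are hypothetical.

```python
N = 4  # assumption: x ranges over {0, 1, 2, 3} and increments wrap around


def atomic(update):
    """Traces of an atomic assignment x := update(x): one transition (s, s')."""
    return {((s, update(s)),) for s in range(N)}


def seq(traces_1, traces_2):
    """Sequential composition: concatenation of finite traces."""
    return {a + b for a in traces_1 for b in traces_2}


def merge(a, b):
    """All interleavings of two finite traces."""
    if not a:
        return {b}
    if not b:
        return {a}
    return ({(a[0],) + r for r in merge(a[1:], b)}
            | {(b[0],) + r for r in merge(a, b[1:])})


def par(traces_1, traces_2):
    """Parallel composition of sets of finite traces."""
    result = set()
    for a in traces_1:
        for b in traces_2:
            result |= merge(a, b)
    return result


increment = atomic(lambda s: (s + 1) % N)  # x := x + 1
set_one = atomic(lambda s: 1)              # x := 1

double_increment = seq(increment, increment)  # x := x + 1 ; x := x + 1
set_then_increment = seq(set_one, increment)  # x := 1 ; x := x + 1

same_for_increments = seq(double_increment, double_increment) == par(double_increment, double_increment)
same_for_mixed = seq(set_then_increment, set_then_increment) == par(set_then_increment, set_then_increment)
```

In this model the interleavings of `x:=1 ; x:=x+1` with itself include traces in which both assignments of 1 precede both increments, and these are not traces of the sequential composition. For the double increment every transition has the same shape, so reordering changes nothing.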

Proposition  2.1  (Sufficient  conditions  for  robustness) 

1.  Atomic  statements  V:[P,Q]  are  robust. 

2. If $C$ is robust, then $C^{m}$ is also robust for all $m > 0$.

3.  If  C  is  robust,  then  C*  is  also  robust. 

Proof:  1)  Let  A  =  V:[P,  Q]  be  an  atomic  statement.  We  show  the  proposition 
by  induction  over  n.  Base:  n  =  1.  Trivial.  Step:  n'  =  n  -|-  1.  Suppose  that 
=rt  irA.  We  have 


^n+l 

def 

A;A’^ 

def 

=rt 

Induction  hypothesis,  Lemma  2.1 

=rt 

A II  [  r 

Lemma  2.1 

=rt 

||”+iyl. 

def 

2)  Let  C  be  robust.  We  have  to  show  C* 

Drx  (C^r  =rt  for  all 

n  >  1.  We  have 

C* 

=rt 

def 

def 

=rt 

m 

Lemma  2.1 

=rt 

||”'"C' 

C  robust 

=rt 

r[\rc] 

Lemma  2.1 

||"C"". 

C  robust,  Lemma  2.1 

3)  The  robustness  of  C*  follows  from  the  robustness  of  C'^  for  all  m  >  1.  ■ 
2.3 Embedding a standard parallel programming language

The standard shared-variable parallel programming language that was used
in [00703], for instance, is embedded into our setting through the following
abbreviations. The embeddings for conditionals, while loops, and the await
synchronization statement are directly taken from Brookes [Bro96b].

2.3.1 Idling

Let $B$ be a boolean expression. An idling, or stuttering, step satisfying $B$ will
be abbreviated by $\{B\}$. That is,

$$\{B\} \;=\; \emptyset:[B,B].$$

Recall that variables not mentioned in $V$ remain unchanged in the atomic
statement $V:[P,Q]$. Consequently, if $V$ is empty, the initial and the final state
must be identical. Note that this implies

0:[B,R]  — T't  0:[^i,B]  9\:[B,tt]. 

The  embedding  of  the  skip  statement  thus  is  straightforward. 

$$\mathbf{skip} \;=\; \{tt\}$$


2.3.2  Assignments 

Assignment to a simple variable $x$ is encoded as follows:

$$x:=e \;=\; x:[tt,\ x = \overleftarrow{e}]$$

An array $A$ with indices from 1 to $n$ stands for a set of variables $A[1]$ through
$A[n]$. An array assignment $A[e']:=e$ is captured as follows.

$$A[e']:=e \;=\; \{A[1], \ldots, A[n]\}:[1 \le e' \le n,\ Q]$$

where $Q$ is

$$Q \;=\; \forall 1 \le i \le n.\ A[i] = \begin{cases} \overleftarrow{e} & \text{if } i = \overleftarrow{e'} \\ \overleftarrow{A[i]} & \text{otherwise.} \end{cases}$$

Note that the above assignment has no traces if the array index $e'$ evaluates to
a value outside the array bounds. While the examples in the later chapters of
this document make ample use of arrays, the array index in these examples will
always be a constant $i$. In this case, the above encoding simplifies to

$$A[i]:=e \;=\; A[i]:[tt,\ A[i] = \overleftarrow{e}].$$
2.3.3 Conditionals

A conditional is characterized by two cases, each corresponding to the execution
of one of the branches. Before the then-branch is executed, execution exhibits a
stuttering step in which the test evaluates to true. Analogously for the else-branch.

$$\mathbf{if}\ B\ \mathbf{then}\ C_1\ \mathbf{else}\ C_2 \;=\; (\{B\} ; C_1) \vee (\{\neg B\} ; C_2)$$

2.3.4  Case  statements 

A case statement abbreviates nested conditionals, that is,

    case x of
      e_1 : C_1 |
      e_2 : C_2 |
      ...
      e_{n-1} : C_{n-1} |
      else : C_n
    end

stands for

    if x = e_1 then C_1
    else if x = e_2 then C_2
    ...
    else if x = e_{n-1} then C_{n-1}
    else C_n.


2.3.5  Loops 

A while loop is also characterized by two cases: one in which the iteration
terminates in a state falsifying the loop condition and one in which the iteration
does not terminate.

$$\mathbf{while}\ B\ \mathbf{do}\ C \;=\; ((\{B\} ; C)^{*} ; \{\neg B\}) \vee (\{B\} ; C)^{\omega}$$

Let $C$ be a program in which $i$ is an integer variable that is only read and never
assigned to. Also, let $n$ be a constant. Then, a for loop can be defined as

$$\mathbf{for}\ i = 1\ \mathbf{to}\ n\ \mathbf{do}\ C \;=\; C[1/i] ; \ldots ; C[n/i]$$

where $C[j/i]$ denotes the program that is obtained from $C$ by replacing all free
occurrences of $i$ by $j$. Also, let

$$\mathbf{for}\ i = n\ \mathbf{downto}\ 1\ \mathbf{do}\ C \;=\; C[n/i] ; \ldots ; C[1/i].$$

Sometimes the loop body is to be executed only when a certain predicate $B$
is satisfied. Let

$$\mathbf{for}\ i = 1\ \mathbf{to}\ n\ \mathbf{do}\ C\ \mathbf{st}\ B \;=\; \mathbf{for}\ i = 1\ \mathbf{to}\ n\ \mathbf{do}\ \mathbf{if}\ B\ \mathbf{then}\ C.$$
2.3.6 Declarations

As defined in Section 2.1.3, a local variable $x$ can only be initialized by a constant
or a variable. Initialization of $x$ with an arbitrary expression $e$ is defined in terms
of an assignment.

$$\mathbf{new}\ x = e\ \mathbf{in}\ C \;=\; \mathbf{new}\ x = v\ \mathbf{in}\ x:=e ; C$$

where $e$ is an arbitrary expression over the domain of $x$, and $v$ is some value in
the domain of $x$.

The declaration and initialization of an array $A[1..n]$ can be abbreviated by

$$\mathbf{new}\ A[1..n] = e\ \mathbf{in}\ C \;=\; \mathbf{new}\ A[1] = e, \ldots, A[n] = e\ \mathbf{in}\ C.$$

We assume that the array variables $A[1]$ through $A[n]$ have the same domain
associated with them.

2.3.7 n-ary parallel compositions

In  this  thesis  we  will  often  consider  parallel  compositions  with  an  arbitrary  but 
finite  number  n  of  components.  For  notational  convenience  we  introduce  the 
following  abbreviations. 

WUCi  =  Cl  II ...  II  c„ 

Ilf . ”^Ci  =  Ci||...||C„ 

2.3.8  Synchronization  statements 

Two parallel processes can synchronize using the await statement. The
execution of await B then C blocks the process until B becomes true and then
executes C atomically. To ensure termination, C is typically restricted to a
sequence of assignments to distinct variables. We will adopt the same restriction.

\[ \textbf{await}\ B\ \textbf{then}\ x_1 := e_1 \cdots x_n := e_n\ \textbf{end} \;=\; \{x_1, \ldots, x_n\}{:}[B, Q] \,\vee\, \{\neg B\}^{\omega} \]

where

\[ Q = (x_1' = e_1) \wedge \cdots \wedge (x_n' = e_n) \]

and all $x_i$ are distinct. An important special case arises when C is skip.

\[ \textbf{await}\ B \;=\; \textbf{await}\ B\ \textbf{then}\ \textbf{skip} \;=\; \{B\} \vee \{\neg B\}^{\omega} \]

Note how the await statement is implemented using busy waiting. As in most
of the literature, e.g., [OG76a, Jon81, Sti88], the evaluation of expressions and
the execution of assignments are assumed here to be atomic. In Section 2.4 we
will sketch how the theory can be extended to non-atomic assignments and
expressions.
2.3.9 Message-passing constructs

In most of the literature on concurrency theory, shared-variable and
message-passing concurrency are given semantic models that are sometimes very
different, e.g., [BHR84, Bro86, dBKPR91, UK93b, Bro96b]. Since we want to be able to
move freely between the two paradigms, we need a uniform model that
captures shared-variable and message-passing concurrency in the same semantic
framework. Consider the two standard message-passing primitives c?x and c!e
where c is a channel. The input statement c?x reads the next item off c and
assigns it to x. If c is empty, the statement blocks until c is non-empty. Thus, if c
remains empty forever, the statement also blocks forever. The output statement
c!e evaluates the expression e and appends the resulting value at the end of c. It
never blocks. The two primitives thus receive an asynchronous communication
semantics.

We naturally fit these two constructs in our language by modeling a channel
c as a variable ranging over queues. More precisely,

\[ c?x = \textbf{await}\ c \neq \varepsilon\ \textbf{then}\ x := hd(c)\,;\ c := tl(c)\ \textbf{end} \]

\[ c!e = c := enqueue(c, e) \]

where c is a variable ranging over queues, hd(c) and tl(c) return the head and
tail of c respectively, and enqueue(c, e) returns a queue that is like c except that
the value of e is appended at the end.
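
The queue view of channels can be illustrated with a small Python sketch. It is not part of the formal development: the class name `Channel` and the method `try_receive` are assumptions made for illustration, and blocking is modelled as a failed attempt that the caller would retry, in the spirit of the busy-waiting reading of await.

```python
from collections import deque

class Channel:
    """Channel modelled as a variable ranging over (unbounded FIFO) queues."""

    def __init__(self):
        self.queue = deque()

    def send(self, value):
        """c!e: append the value at the end of c; never blocks."""
        self.queue.append(value)

    def try_receive(self):
        """One attempt at c?x.

        Returns (True, head) and removes the head if c is non-empty,
        otherwise (False, None). A blocked receive corresponds to
        retrying until an attempt succeeds.
        """
        if not self.queue:
            return (False, None)
        return (True, self.queue.popleft())
```

Because `send` never fails and `try_receive` only succeeds on a non-empty queue, the sketch mirrors the asynchronous communication semantics described above.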


2.3.10  Properties  of  embedded  programs 

Every  program  over  the  constructs  presented  as  abbreviations  in  this  section 
not  only  has  a  non-empty  set  of  transition  traces  but  also  a  non-empty  set  of 
executions. 

Proposition  2.2  (Programs  with  non-empty  denotations) 

If  C  consists  of  assignments,  sequential  and  parallel  compositions,  while  and  for 
loops,  await  statements,  local  variable  declarations,  and  the  message-passing 
primitives  only,  then 

1. C has a non-empty denotation under $\mathcal{T}$, that is,

\[ \mathcal{T}[\![C]\!] \neq \emptyset, \]

2. C has a non-empty denotation under $\mathcal{E}$, that is,

\[ \mathcal{E}[\![C]\!] \neq \emptyset. \]

Note that the above two properties also imply non-empty denotations under the
closed variants of $\mathcal{T}$ and $\mathcal{E}$.

Proof: We say a program C is complete if for every natural number n, and for
every sequence of states $s_0, \ldots, s_n$, there exists a transition trace in $\mathcal{T}[\![C]\!]$
of the form

\[ (s_0, s_0')(s_1, s_1') \ldots (s_n, s_n')\,\alpha. \]

We can prove by structural induction that every program of the form described
above is complete. The first property then is an easy corollary. Repeated
applications of the completeness property then show that for every complete
program C and every initial state $s_0$, C has an execution starting in that state.
The second property follows. ■

The notion of program defined in this section allows for the expression of
possibly non-terminating programs that use the standard constructs of a simple
imperative language, fair parallelism, and local variables and channels. Means of
synchronization and communication are shared variables, message passing, and
the await statement.


2.4  Modeling  fine-grained  concurrency 

So far, assignments and expression evaluation have been considered atomic.
While this rather high level of granularity can be achieved in realistic parallel
implementations through the use of more low-level atomic statements, the
loss of efficiency typically is prohibitive. Moreover, coarse-grained parallelism
may represent poor use of resources. In this section, we will relax the atomicity
constraint and achieve a finer-grained, more realistic level of concurrency.
For instance, we might allow interruption of an assignment x := e during the
evaluation of e, and interruption of a conditional or a while loop during the
evaluation of its test. Note that the introduction of non-atomic expressions has
significant and sometimes surprising consequences. For instance, the evaluation
of $done \vee \neg done$ or $x = x$ may now return ff, because the values of the variables
done or x may be altered concurrently. Similarly, the evaluation of $x + x$, where
x is an integer variable, may yield an odd integer. Laws of programming that
are usually taken for granted cease to hold. Finer levels of granularity further
increase the complexity of parallel programming.
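
The odd-sum phenomenon is easy to reproduce. The following Python sketch is an illustration only: the function names and the `interrupt` callback, which stands in for a single environment step between the two reads of x, are assumptions and not part of the formal model.

```python
def eval_sum_non_atomic(state, interrupt):
    """Evaluate x + x by reading x twice; the environment may act in between."""
    first = state["x"]
    interrupt(state)          # environment step between the two reads
    second = state["x"]
    return first + second

def eval_sum_atomic(state):
    """Atomic evaluation of x + x: both reads see the same state."""
    return 2 * state["x"]
```

If x is 2 at the first read and the environment sets it to 3 before the second read, the non-atomic evaluation returns 5, whereas the atomic evaluation can only return even values.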

Expression traces. Brookes has shown how the transition trace semantics can
be adapted straightforwardly to model fine-grained parallelism [Bro96b]. The
key idea is to extend the semantics such that expressions denote sets of traces
of the form

\[ ((s_0, s_0)(s_1, s_1) \ldots (s_n, s_n),\ v) \]

where each of the $s_i$ is a state and $v$ is a value. The intuitive meaning of such a
trace is that the evaluation of $e$ is started in state $s_0$, interrupted $n$ times, where
the $i$th interruption changes the state from $s_{i-1}$ to $s_i$, and finally results in value
$v$. For simplicity we will assume that the evaluation of expressions always
terminates. This idea can readily be extended to labeled transition traces. To
illustrate the basic approach, we restrict attention to boolean expressions over
the constants tt and ff, variables, negation, and conjunction. Analogous
definitions can be made for other boolean connectives and arithmetic expressions.
First, the closure operations must be extended to labeled boolean expression
traces. This is straightforward and omitted. Let $\wp^\dagger((\Sigma \times \Lambda \times \Sigma)^+ \times \{tt, ff\})$ be
the set of closed sets of boolean expression traces.
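Extending the semantics. We then extend the semantic map $\mathcal{T}^\dagger$ such that
it maps boolean expressions $B$ to $\wp^\dagger((\Sigma \times \Lambda \times \Sigma)^+ \times \{tt, ff\})$. Because we will
later illustrate and compare the impact of different evaluation strategies on the
program development process, we introduce four different kinds of conjunctions.
$B_1 \wedge B_2$ stands for standard, atomic conjunction. $B_1 \wedge_{lr} B_2$ evaluates its
arguments from left to right, $B_1 \wedge_{rl} B_2$ evaluates its arguments from right to left,
and $B_1 \wedge_{p} B_2$ evaluates its arguments in parallel. Let $\mathcal{T}^\dagger$, for $l \in \Lambda$, be defined
as follows.

\[
\begin{array}{rcl}
\mathcal{T}^\dagger[\![tt]\!] &=& \{((s,l,s), tt) \mid s \in \Sigma\}^\dagger \\
\mathcal{T}^\dagger[\![x]\!] &=& \{((s,l,s), v) \mid s(x) = v\}^\dagger \\
\mathcal{T}^\dagger[\![\neg B]\!] &=& \{(\alpha, \neg v) \mid (\alpha, v) \in \mathcal{T}^\dagger[\![B]\!]\} \\
\mathcal{T}^\dagger[\![B_1 \wedge B_2]\!] &=& \{((s,l,s), ff) \mid ((s,l,s), ff) \in \mathcal{T}^\dagger[\![B_1]\!] \cup \mathcal{T}^\dagger[\![B_2]\!]\}^\dagger \\
&& \cup\ \{((s,l,s), tt) \mid ((s,l,s), tt) \in \mathcal{T}^\dagger[\![B_1]\!] \cap \mathcal{T}^\dagger[\![B_2]\!]\}^\dagger \\
\mathcal{T}^\dagger[\![B_1 \wedge_{lr} B_2]\!] &=& \{(\alpha, ff) \mid (\alpha, ff) \in \mathcal{T}^\dagger[\![B_1]\!]\}^\dagger \\
&& \cup\ \{(\alpha\beta, v) \mid (\alpha, tt) \in \mathcal{T}^\dagger[\![B_1]\!] \wedge (\beta, v) \in \mathcal{T}^\dagger[\![B_2]\!]\}^\dagger \\
\mathcal{T}^\dagger[\![B_1 \wedge_{rl} B_2]\!] &=& \{(\alpha, ff) \mid (\alpha, ff) \in \mathcal{T}^\dagger[\![B_2]\!]\}^\dagger \\
&& \cup\ \{(\alpha\beta, v) \mid (\alpha, tt) \in \mathcal{T}^\dagger[\![B_2]\!] \wedge (\beta, v) \in \mathcal{T}^\dagger[\![B_1]\!]\}^\dagger \\
\mathcal{T}^\dagger[\![B_1 \wedge_{p} B_2]\!] &=& \{(\gamma, v_1 \wedge v_2) \mid (\alpha, v_1) \in \mathcal{T}^\dagger[\![B_1]\!] \wedge (\beta, v_2) \in \mathcal{T}^\dagger[\![B_2]\!] \\
&& \quad \wedge\ \gamma \in \alpha \,\|\, \beta\}^\dagger
\end{array}
\]
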

Intuitively, $\mathcal{T}^\dagger[\![B]\!]$ records which finite stuttering sequences cause expression $B$
to be evaluated to which value. For instance, the trace set for the boolean
expression $x \wedge_{lr} \neg x$ contains traces of the form

\[ ((x, l_1, x)(\neg x, l_2, \neg x),\ tt), \]

where $x$ and $\neg x$ stand for states in which the value of $x$ is tt and ff, respectively.
In other words, in an environment that resets $x$ between evaluation of the first
and second conjuncts, $x \wedge_{lr} \neg x$ can evaluate to true. As another example, consider
the boolean expression $x \wedge_{lr} y$. It has the trace

\[ ((x \wedge \neg y, l_1, x \wedge \neg y)(\neg x \wedge y, l_2, \neg x \wedge y),\ tt), \]

that is, the evaluation of $x \wedge_{lr} y$ can yield true without ever passing through
a state in which both $x$ and $y$ are true simultaneously. Note that left-to-right
evaluation is equivalent to right-to-left evaluation on commuted arguments, that
is,

\[ B_1 \wedge_{lr} B_2 =_{\mathcal{T}^\dagger} B_2 \wedge_{rl} B_1 \]

for all boolean expressions $B_1$ and $B_2$.
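
The effect of the evaluation order can be made concrete with a deliberately simplified Python sketch. It considers only traces with exactly two steps, so each conjunct is read in its own state; the function names are hypothetical and the closure conditions of the semantics are ignored.

```python
def conj_atomic(b1, b2, s):
    """Atomic conjunction: both conjuncts are read in the same state."""
    return b1(s) and b2(s)

def conj_lr(b1, b2, s_first, s_second):
    """Left-to-right: b1 is read in s_first; only if it holds is b2 read, in s_second."""
    if not b1(s_first):
        return False
    return b2(s_second)

def conj_rl(b1, b2, s_first, s_second):
    """Right-to-left: b2 is read first, then b1."""
    return conj_lr(b2, b1, s_first, s_second)

# Boolean expressions as functions from states (dicts) to truth values.
x = lambda s: s["x"]
y = lambda s: s["y"]
not_x = lambda s: not s["x"]
```

With the environment resetting x between the two reads, `conj_lr(x, not_x, {"x": True}, {"x": False})` evaluates to true, although no single state satisfies both conjuncts. By construction, `conj_rl(b2, b1, ...)` agrees with `conj_lr(b1, b2, ...)`, which is the two-step instance of the equivalence stated above.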

Extending the syntax. To extend our framework appropriately we need to
give the user a way to use these evaluation traces for specification purposes. To
this end, we augment our language with the statement $\{B \Downarrow v\}$ where $B$ is a
boolean expression and $v$ is a boolean value. Formally, we extend the definition
of labeled and unlabeled programs $C$ and $D$ in Section 2.1.3 by the clauses

\[ C ::= \ldots \mid \{B \Downarrow v\} \]

\[ D ::= \ldots \mid \{B \Downarrow v\} \]

where $v \in \{tt, ff\}$. We will leave the scope of labels and the set of contexts
unchanged. Note, however, that the framework could easily be extended to
allow labeled expressions, contexts with their hole in the place of an expression,
and thus for refinement of expressions. $\{B \Downarrow v\}$ stands for all finite stuttering
sequences that cause $B$ to be evaluated to $v$, that is,

\[ \mathcal{T}^\dagger[\![\{B \Downarrow v\}]\!] = \{\alpha \mid (\alpha, v) \in \mathcal{T}^\dagger[\![B]\!]\}. \]

Adapting  the  embedding.  The  encoding  of  assignments  to  boolean  variables, 
conditionals  and  while  loops  can  be  rephrased  as  follows: 

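\[ x := B \;=\; (\{B \Downarrow tt\}\,;\ x{:}[tt, x']) \,\vee\, (\{B \Downarrow ff\}\,;\ x{:}[tt, \neg x']) \]

\[ \textbf{if}\ B\ \textbf{then}\ C_1\ \textbf{else}\ C_2 \;=\; (\{B \Downarrow tt\}\,;\ C_1) \,\vee\, (\{B \Downarrow ff\}\,;\ C_2) \]

\[ \textbf{while}\ B\ \textbf{do}\ C \;=\; ((\{B \Downarrow tt\}\,;\ C)^{*}\,;\ \{B \Downarrow ff\}) \,\vee\, (\{B \Downarrow tt\}\,;\ C)^{\omega}. \]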

Now,  the  execution  of  assignments  may  be  interrupted  during  the  evaluation  of 
the  right-hand  expression.  The  execution  of  conditionals  and  while  loops  may 
be  interrupted  during  the  evaluation  of  the  test.  In  Section  4.1.1  we  will  see 
how  evaluation  strategies  can  be  compared  using  trace  inclusion. 

The following lemma collects a few properties of fine-grained boolean
expressions that we will need later.

Lemma 2.3 (Properties of fine-grained boolean expressions)

1. An expression trace evaluates a negation $\neg B$ to value $v$ if and only if it
also evaluates $B$ to $\neg v$, that is,

\[ \{\neg B \Downarrow v\} =_{\mathcal{T}^\dagger} \{B \Downarrow \neg v\} \]

for all boolean expressions $B$ and values $v \in \{tt, ff\}$.

2. The De Morgan laws also hold for fine-grained boolean expressions. We
only list the laws we will later need.

\[ \neg\neg B =_{\mathcal{T}^\dagger} B \]
\[ \neg(B_1 \wedge_{lr} B_2) =_{\mathcal{T}^\dagger} \neg B_1 \vee_{lr} \neg B_2 \]


Proof:  Directly  from  the  definitions. 
2.5 Discussion

This  concludes  our  presentation  of  the  syntax  and  semantics  of  our  language. 
We  conclude  that  the  language  supports 

•  fair  parallel  computation, 

•  shared-variable  and  message-passing  concurrency, 

•  local  variables  and  channels, 

•  the  expression  of  reactive  systems. 

Moreover,  the  transition  trace  semantics 

• is easy to work with, because it is compositional. The semantics of a
composite program is determined solely in terms of the semantics of its
constituent programs. Moreover, the treatment of the standard programming
constructs is reminiscent of extended regular expressions and thus
rather intuitive and mnemonic.

• is robust. Finer levels of granularity or other language constructs can be
modeled rather straightforwardly.

• validates standard laws of concurrent programming (Lemmas 2.1 and 2.2).

• is fully abstract for standard, shared-variable parallel programs. In other
words, it is at the right level of abstraction compared to the standard
notion of operational behaviour [Bro96b]. It thus avoids unnecessary
distinctions between programs like C and C ; skip, for instance.

• does not allow reasoning about complexity. For instance, all finite amounts
of stuttering are equated, so that $skip =_{\mathcal{T}^\dagger} skip^{*}$.

Due to these properties, $\mathcal{T}^\dagger$ and $\mathcal{E}^\dagger$ will be used as our semantic modeling
tools in this thesis. The term “closed” will stand for “sm-closed” unless noted
otherwise. The use of $\mathcal{T}$ and $\mathcal{E}$ will mostly be confined to proofs.


Chapter 3


Assumption-commitment reasoning


Few programs execute in complete isolation. Typically, programs interact with,
for example, other programs, users, devices, or sensors. Moreover, a program
usually is not expected to accomplish its purpose in completely arbitrary
environments. A server will grant eventual exclusive access to a shared resource
only if other users always eventually release that resource. In its most general
form, assumption-commitment reasoning (sometimes also called
assumption-guarantee or rely-guarantee reasoning) allows the verification of a program
under the assumption that its environment behaves a certain way. In the
concurrent setting, assumption-commitment reasoning paves the way towards a
compositional treatment of concurrent composition.

Historically, the search for a compositional treatment of concurrency began
with Owicki and Gries' seminal work. In [OG76a], they attempt to extend
Hoare logic to a shared-variable parallel language. More precisely, Hoare triples
are generalized to proof outlines. A proof outline is an annotated program in
which any two adjacent statements are separated by a predicate describing the
properties that hold at that point. Two proof outlines can be put in parallel
if they are interference-free, that is, none of the predicates of one program are
invalidated by the atomic statements of the other. In other words, the predicates
in the proof outline serve as assumptions that the program implicitly places on
the environment it is going to be executed in. The premises of the rule for
parallel composition require the user to identify these assumptions, and show
that each program respects the assumptions of the other. This non-interference
check ensures soundness, but also makes the rule non-compositional and thus
unsuitable for program development. In [Jon81], Jones addresses this problem
by using rely- and guarantee-conditions to state explicitly the assumptions and
the guarantees of a program. Formulas are of the form

\[ C \models (P, R, G, Q) \]
where $C$ is a program and $(P, R, G, Q)$ is a specification consisting of four
predicates $P$, $R$, $G$ and $Q$. The precondition $P$ and the rely-condition $R$ constitute
the assumptions that $C$ can make about its environment. In return, $C$ must
satisfy the postcondition $Q$ and the guarantee-condition $G$. Note that $R$ and
$G$ are binary in the sense of Section 2.1.2 to allow for the description of pairs
of states. Program $C$ satisfies the rely-guarantee tuple $(P, R, G, Q)$ if $C$
terminates in a state satisfying $Q$ and all program transitions satisfy $G$, whenever the
initial state satisfies $P$ and all environment transitions satisfy $R$. This notion
naturally extends Hoare triples $\{P\}\ C\ \{Q\}$ for total correctness from sequential
programming. Rather than placing assumptions on the initial state only,
we also state assumptions about the environment. Moreover, rather than
specifying the final state only, we also specify the intermediate behaviour of $C$.
The book-keeping of assumptions and guarantees pays off when formulating the
rule for parallel programs.

\[ \frac{C_1 \models (P, R_1, G_1, Q_1) \qquad C_2 \models (P, R_2, G_2, Q_2) \qquad G_2 \Rightarrow R_1 \qquad G_1 \Rightarrow R_2}{C_1 \,\|\, C_2 \models (P,\ R_1 \wedge R_2,\ G_1 \vee G_2,\ Q_1 \wedge Q_2)} \]

The assumptions of one program have to be implied by the guarantees of the
other and vice versa. Informally, the two programs running in parallel have to
be shown to respect each other's needs.
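
Over a finite state space, the two side conditions of the parallel rule can be checked by brute force. The sketch below is only an illustration under that finiteness assumption; rely- and guarantee-conditions are modelled as Python functions on pairs of states, and the names `implies` and `parallel_side_conditions` are hypothetical.

```python
from itertools import product

def implies(g, r, states):
    """Check g => r for binary predicates over all pairs of states."""
    return all(r(s, t) for s, t in product(states, repeat=2) if g(s, t))

def parallel_side_conditions(r1, g1, r2, g2, states):
    """Side conditions of the parallel rule: G2 => R1 and G1 => R2."""
    return implies(g2, r1, states) and implies(g1, r2, states)

# Example binary predicates over integer states (the value of one variable).
nondecreasing = lambda s, t: t >= s
unchanged = lambda s, t: t == s
anything = lambda s, t: True
```

For instance, a component that guarantees only `anything` cannot be composed with a partner that relies on `nondecreasing`, because the guarantee does not imply the rely-condition.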

Using Jones' idea of explicitly stating the assumptions and the commitments
of each program, Stirling generalized Owicki and Gries' logic [Sti88]. In his
setting, the formula

\[ [P, \Gamma]\ C\ [Q, \Delta] \]

expresses that if the initial state satisfies $P$ and the parallel environment
preserves all the predicates in $\Gamma$, then $C$ will terminate in a state satisfying $Q$ while
also preserving the predicates in $\Delta$. The parallel rule then takes on the following
shape.

\[ \frac{[P, \Gamma_1]\ C_1\ [Q_1, \Delta_1] \qquad [P, \Gamma_2]\ C_2\ [Q_2, \Delta_2] \qquad \Gamma_1 \subseteq \Delta_2 \qquad \Gamma_2 \subseteq \Delta_1}{[P, \Gamma_1 \cup \Gamma_2]\ C_1 \,\|\, C_2\ [Q_1 \wedge Q_2,\ \Delta_1 \cap \Delta_2]} \]

Compared to Owicki and Gries' approach, Stirling's formulation is compositional
and more general. The following definition will adapt Stirling's compositional
formulation of assumption-commitment reasoning to our setting.

Definition 3.1 (Assumption-commitment formulas)

1. Let $s$ and $s'$ be two states, $l$ be a label, $P$ be a unary predicate and $\Gamma$ be a
set of unary predicates. We say that $(s, l, s')$ preserves $P$,

\[ (s, l, s') \models pre\ P, \]

for short, if

\[ (s, s') \models P \Rightarrow P'. \]

We say that $(s, l, s')$ preserves $\Gamma$,

\[ (s, l, s') \models pre\ \Gamma, \]

for short, if

\[ (s, l, s') \models pre\ P \]

for all $P \in \Gamma$. $pre\ P$ and $pre\ \Gamma$ should thus be viewed as binary predicates.

2. Let $\alpha = (s_0, l_0, s_0')(s_1, l_1, s_1') \ldots$ be a labeled transition trace.

• Let $P$ be a unary predicate. We say that $\alpha$ satisfies the assumptions
$P$ and $\Gamma$,

\[ \alpha \models assump(P, \Gamma), \]

for short, iff the first state satisfies $P$ and $\Gamma$ is preserved across all
gaps along $\alpha$, that is,

- $s_0 \models P$ and

- $(s_i', s_{i+1}) \models pre\ \Gamma$ for all $0 \le i < length(\alpha)$

where $length(\alpha)$ stands for the number of pairs in $\alpha$ minus 1.
Remember that $\alpha$ may be infinite.

• Let $Q$ be a unary predicate and let $\Delta$ be a set of unary predicates.
We say that $\alpha$ satisfies the guarantees $Q$ and $\Delta$,

\[ \alpha \models guar(Q, \Delta), \]

for short, iff the last state in $\alpha$ (if it exists) satisfies $Q$ and $\Delta$ is
preserved across all transitions along $\alpha$, that is,

- $last(\alpha) \models Q$, if $\alpha$ is finite and

- $(s_i, s_i') \models pre\ \Delta$ for all $0 \le i \le length(\alpha)$

where $last(\alpha)$ denotes the last state of $\alpha$, if $\alpha$ is finite.

3. We say that $\alpha$ guarantees $Q$ and $\Delta$ under assumptions $P$ and $\Gamma$,

\[ [P, \Gamma]\ \alpha\ [Q, \Delta] \]

for short, iff

\[ \alpha \models assump(P, \Gamma) \]

implies

\[ \alpha \models guar(Q, \Delta). \]

4. Let $T$ be a set of transition traces. $T$ guarantees $Q$ and $\Delta$ under
assumptions $P$ and $\Gamma$,

\[ [P, \Gamma]\ T\ [Q, \Delta] \]

for short, iff

\[ [P, \Gamma]\ \alpha\ [Q, \Delta] \]

for all $\alpha \in T$.

5. Let $C$ be a program. $C$ guarantees $Q$ and $\Delta$ under assumptions $P$ and $\Gamma$,

\[ [P, \Gamma]\ C\ [Q, \Delta] \]

for short, iff

\[ [P, \Gamma]\ \mathcal{T}^\dagger[\![C]\!]\ [Q, \Delta]. \]

6. We will call $[P, \Gamma]\ C\ [Q, \Delta]$ an assumption-commitment formula or
assumption-commitment specification. □
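
For finite traces and finite sets of predicates, items 1 to 3 of the definition can be executed directly. The Python sketch below is an illustration under those two restrictions; traces are lists of triples (s, l, s'), predicates are Python functions on states, and all function names are hypothetical.

```python
def preserves(preds, s, t):
    """(s, t) |= pre Gamma: every predicate that holds in s still holds in t."""
    return all(p(t) for p in preds if p(s))

def assump(trace, P, Gamma):
    """First state satisfies P and Gamma is preserved across all gaps."""
    if not P(trace[0][0]):
        return False
    return all(preserves(Gamma, trace[i][2], trace[i + 1][0])
               for i in range(len(trace) - 1))

def guar(trace, Q, Delta):
    """Last state satisfies Q and Delta is preserved across all transitions."""
    return Q(trace[-1][2]) and all(preserves(Delta, s, t) for s, _, t in trace)

def ac_formula(trace, P, Gamma, Q, Delta):
    """[P, Gamma] trace [Q, Delta]: the assumptions imply the guarantees."""
    return (not assump(trace, P, Gamma)) or guar(trace, Q, Delta)
```

A gap is the pair formed by the target state of one step and the source state of the next, which is why `assump` inspects `trace[i][2]` and `trace[i + 1][0]`, while `guar` inspects the source and target of each individual step.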

Example 3.1 (Assumption-commitment formulas)

Let C be the following program to multiply two numbers stored in x and y by
repeated addition. The result is to be stored in mul.

C = mul := 0;
    new cnt = y in
      while cnt > 0 do
        mul := mul + x;
        cnt := cnt - 1
      od
    end

We assume that x and y are initialized with two natural numbers m and n
respectively. Clearly, C only computes $n \cdot m$ if certain restrictions are placed
on the way the parallel environment treats mul. Under the assumptions
$\{mul = v \mid v \in Dom_{mul} \wedge 0 \le v\}$, the environment would have to preserve all
predicates $mul = v$ with $0 \le v$.
Consequently, the environment could not change the value of mul at all and
correctness of C would follow. Interestingly, however, it suffices to assume that
the environment only preserves the value of mul if it is a multiple of m between
0 and $n \cdot m$.

\[ [\,x = m \wedge y = n,\ \{x = m \wedge y = n\} \cup \Gamma_{mul}\,] \]

\[ C \]

\[ [\,mul = n \cdot m \wedge x = m \wedge y = n,\ Preds(Var \setminus \{mul\})\,] \]

where

\[ \Gamma_{mul} = \{mul = v \mid v \in Dom_{mul} \wedge 0 \le v \le n \cdot m \wedge v \bmod m = 0\}. \]

Under these assumptions C will leave the desired result in mul. Moreover, since
C only changes mul, it will leave all other variables unchanged. Consequently,
C will preserve all predicates with free variables in $Var \setminus \{mul\}$, that is, all
predicates in $Preds(Var \setminus \{mul\})$. Note that if there is an upper bound for m and
n, that is, both values are known to be below a certain maximal value max, then
the set of assumptions $\Gamma_{mul}$ becomes finite and will contain precisely maxjn
predicates. Also note that C preserves more than just the predicates in which
mul does not occur free. In other words, the set $Preds(Var \setminus \{mul\})$ of preserved
predicates is not complete for C. For instance, the predicates $mul \bmod x = 0$
and $x \ge 0 \Rightarrow mul \ge 0$ are also preserved by C. This example thus points
us to a fundamental weakness of assumption-commitment formulas. A finite
representation of the set of all predicates preserved by a program is typically
impossible. Instead, only those predicates whose preservation is essential will
be mentioned. □
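
The role of the assumption set in this example can be observed in a small simulation. The Python sketch below is illustrative only: it assumes m ≥ 1, lets the environment act once after every atomic step of C, and uses two hypothetical environments, one that changes mul only when no predicate in the assumption set forbids it and one that ignores the assumptions.

```python
def make_polite_env(m, n):
    """Environment that tries to increment mul but respects the assumptions (m >= 1)."""
    def env(s):
        # mul = v must be preserved whenever v is a multiple of m in [0, n*m]
        protected = 0 <= s["mul"] <= n * m and s["mul"] % m == 0
        if not protected:
            s["mul"] += 1
    return env

def rude_env(s):
    """Environment that violates the assumptions by always changing mul."""
    s["mul"] += 1

def run_mul(m, n, env):
    """Run C with x = m and y = n; env acts after every atomic step."""
    s = {"x": m, "y": n, "mul": 0}   # state right after mul := 0
    env(s)
    cnt = s["y"]                      # local variable, invisible to env
    while cnt > 0:
        s["mul"] = s["mul"] + s["x"]
        env(s)
        cnt -= 1
    return s["mul"]
```

During the run, mul only ever holds multiples of m between 0 and n · m, so an environment that respects the assumptions never gets the opportunity to change it, while the unconstrained environment corrupts the result.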


3.1 Properties of assumption-commitment formulas


We  list  a  few  useful  properties  of  assumption-commitment  formulas. 

The  first  allows  the  addition  and  removal  of  closed  predicates,  because  they 
are  always  preserved. 


Lemma 3.1 (Assumption-commitment and equivalent and closed predicates)

We have

\[ [P, \Gamma]\ C\ [Q, \Delta] \]

iff

\[ [P, \overline{\Gamma} \cup Preds(\emptyset)]\ C\ [Q, \overline{\Delta} \cup Preds(\emptyset)] \]

where

\[ \overline{\Gamma} = \{P' \mid P \in \Gamma \wedge P \Leftrightarrow P'\} \]

\[ \overline{\Delta} = \{P' \mid P \in \Delta \wedge P \Leftrightarrow P'\}. \]

Proof: It follows directly from the definitions that every transition preserves
all closed predicates. ■


Due to the above lemma, equivalent and closed predicates will not be
explicitly mentioned in the set of assumptions and guarantees of an
assumption-commitment formula.

The  next  lemma  allows  for  weakening  using  trace  inclusion. 

Lemma  3.2  (Weaken  assumption-commitment  formulas) 

If $C_1 \supseteq_{\mathcal{T}^\dagger} C_2$ and

\[ [P, \Gamma]\ C_1\ [Q, \Delta], \]

then

\[ [P, \Gamma]\ C_2\ [Q, \Delta]. \]


Proof: Follows directly from the definition.


Thus, equivalent programs satisfy the same assumption-commitment formulas.

Corollary 3.1 (Trace equivalence and assumption-commitment)
Let $C_1 =_{\mathcal{T}^\dagger} C_2$. We have

\[ [P, \Gamma]\ C_1\ [Q, \Delta] \]

if and only if

\[ [P, \Gamma]\ C_2\ [Q, \Delta]. \]


□ 


The next lemma enables us to ignore the traces that arise from the mumbling
closure condition when proving an assumption-commitment formula. More
precisely, if $\mathcal{T}^{s}[\![C]\!]$ is the set of traces of $C$ that is closed under stuttering only and that guarantees
$Q$ and $\Delta$ under assumptions $P$ and $\Gamma$, then $\mathcal{T}^\dagger[\![C]\!]$ will also guarantee $Q$ and $\Delta$
under assumptions $P$ and $\Gamma$. Intuitively, this is because mumbling can neither
change the final state nor introduce a transition that does not preserve $\Delta$.

Lemma 3.3 (Closure and assumption-commitment formulas)

If

\[ [P, \Gamma]\ \mathcal{T}^{s}[\![C]\!]\ [Q, \Delta] \]

then

\[ [P, \Gamma]\ \mathcal{T}^\dagger[\![C]\!]\ [Q, \Delta] \]

and thus

\[ [P, \Gamma]\ C\ [Q, \Delta]. \]

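Proof: Let $\alpha$ be a trace in the stuttering-closed trace set of the premise. We show that

\[ [P, \Gamma]\ \alpha\ [Q, \Delta] \]

implies

\[ [P, \Gamma]\ \alpha'\ [Q, \Delta] \]

for all $\alpha'$ that arise from $\alpha$ through finite mumbling. The case for infinite
mumbling is similar. Let

\[ \alpha = \alpha_1 (s_0, l, s_1)(s_1, l, s_2) \ldots (s_{n-1}, l, s_n)\,\alpha_2 \]

\[ \alpha' = \alpha_1 (s_0, l, s_n)\,\alpha_2 \]

and let $\alpha' \models assump(P, \Gamma)$. We need to show that $\alpha' \models guar(Q, \Delta)$. We also
have $\alpha \models assump(P, \Gamma)$. By assumption, $\alpha \models guar(Q, \Delta)$. Consequently,

\[ (s_0, l, s_1)(s_1, l, s_2) \ldots (s_{n-1}, l, s_n) \models guar(tt, \Delta) \]

which implies $(s_0, l, s_n) \models guar(tt, \Delta)$. Thus, $\alpha' \models guar(tt, \Delta)$. Moreover, $\alpha'$ is
infinite iff $\alpha$ is infinite; $last(\alpha') = last(\alpha)$ if $\alpha$ is finite. Thus, $\alpha' \models guar(Q, \Delta)$.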


Note that the lemma cannot be strengthened to $\mathcal{T}$, that is, after the addition
of stuttering the trace may violate the assumption-commitment formula even if
the original did not. To see this, consider the trace $(s, l, [s \mid x = 0])$ of program
$x := 0$. The trace satisfies

\[ [tt, Preds(\emptyset)]\ (s, l, [s \mid x = 0])\ [x = 0, Preds(Var \setminus \{x\})]. \]

However, addition of the stuttering step $([s \mid x = 1], l, [s \mid x = 1])$ at the end, for
instance, destroys this property.

The remainder of this section identifies conditions that are sufficient for
establishing assumption-commitment formulas. Definition 3.1 defined $pre\ P$ as
a binary predicate that is satisfied by a transition if and only if the transition
preserves the validity of $P$. We will abuse notation and now also define $pre\ P$ to
be the most general atomic statement whose transitions preserve $P$. Similarly
for $pre\ \Gamma$.

Definition 3.2 Given a predicate $P$ and a set of predicates $\Gamma \subseteq Preds(Var)$,
we define

\[ pre\ P = Var{:}[tt, P \Rightarrow P'] \]

\[ pre\ \Gamma = Var{:}[tt, \forall P \in \Gamma.\ P \Rightarrow P'] \]

\[ pre^{\infty}\, \Gamma = (pre\ \Gamma)^{\infty}. \]

We say that program $C$ preserves $\Gamma$ in all contexts if

\[ C \subseteq_{\mathcal{T}^\dagger} pre^{\infty}\, \Gamma. \]


□

Note that a transition satisfies the binary predicate $pre\ P$ if and only if it is a
trace of the atomic statement $pre\ P$. Since $pre\ P$ is the most general atomic
transition that preserves $P$, $pre^{\infty}\, \Gamma$ is the most general program that preserves
all predicates in $\Gamma$.

Lemma 3.4.1 below makes use of the fact that a program preserves a set of
predicates in all contexts. Lemma 3.4.2 expresses that given an atomic
statement $A$ the formula

\[ [P, \Gamma]\ A\ [Q, \Delta] \]

is true if the precondition $P$ and the characteristic formula of $A$ imply the
postcondition, and both $P$ and $Q$ are preserved by the environment, and $\Delta$
only contains predicates that are preserved by $A$ in initial states satisfying $P$.

Lemma 3.4 (Sufficient conditions for assumption-commitment)

1. If $C$ is known to preserve a set of predicates in all contexts, then it will
also preserve them in a specific context. Formally,

\[ [P, \Gamma]\ C\ [tt, \Delta] \]

if every transition of every transition trace of $C$ satisfies $pre\ \Delta$, that is,

\[ C \subseteq_{\mathcal{T}^\dagger} pre^{\infty}\, \Delta. \]

2. Let $A$ be an atomic statement. If

• $\{P, Q\} \subseteq \Gamma$, and

• $(P \wedge cf_A) \Rightarrow Q'$, and

• $\Delta \subseteq \{R \mid (P \wedge R \wedge cf_A) \Rightarrow R'\}$,

then

\[ [P, \Gamma]\ A\ [Q, \Delta]. \]


Proof:

1. Straightforward from the definitions.

2. Let $\alpha$ be a trace of $A$ in the stuttering-closed semantics. Then $\alpha$ is of the
form $\alpha = \alpha_1 (s, s') \alpha_2$ where $\alpha_1$ and $\alpha_2$ are possibly empty, finite sequences
of stuttering steps. Let $\alpha \models assump(P, \Gamma)$.
The only non-stuttering transition along $\alpha$ is $(s, s')$. Thus, to show that
all transitions along $\alpha$ preserve $\Delta$, we only need to argue that $(s, s')$
preserves $\Delta$. Since $P \in \Gamma$, $P$ is preserved by the environment and thus
not only the first state of $\alpha$ but also the state right before execution of
$A$ satisfies $P$, that is, $s \models P$. With the first premise, this implies
$(s, s') \models guar(Q, \Delta)$. Since $Q \in \Gamma$, $Q$ is preserved by the environment
and thus the last state of $\alpha$ (if it exists) also satisfies $Q$. Consequently,

\[ [P, \Gamma]\ \alpha\ [Q, \Delta]. \]

Lemma 3.3 implies the desired result. ■


Chapter  4 

Notions  of  approximation 


This chapter presents different ways of comparing the behaviour of one program
with  that  of  another.  The  semantics  presented  in  the  previous  section  gives  rise 
to  a  natural,  powerful,  but  context-insensitive  notion  of  approximation.  It  allows 
us  to  compare  the  behaviour  of  two  programs  regardless  of  the  environment  that 
they  are  executed  in. 

However, as described in the introduction, our requirements on a suitable
refinement relation force the development of a context-sensitive notion of
approximation in Section 4.2. It expresses that the behaviour of a program is
approximated by the behaviour of another program in a particular context.
While  both  notions  are  useful,  it  is  the  second  that  will  form  the  basis  of  our 
refinement  calculus. 

Finally,  a  relation  on  contexts  is  defined  in  Section  4.3  that  distinguishes 
contexts  with  respect  to  their  “capabilities”  or  “discriminating  power”.  This 
relation  will  help  us  to  express  that  environment  assumptions  expressed  in  a 
given  context  are  stronger  than  those  of  some  other  context. 


4.1  Context-insensitive  approximation 

A very natural notion of program approximation arises through transition trace
inclusion $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$. The compositionality of the semantics lends a lot of power
to this notion. The notation $pre\ P$ expresses that predicate $P$ remains true if
it is true initially. No conclusions can be made if $P$ is false initially. Very often
there is a need to express that the value of a variable, predicate, or expression
does not change regardless of the initial state. To express this, we introduce the
inv notation.

Definition 4.1 Given an expression $e$, let $inv\ e$ and $inv^{\infty} e$ stand for

\[ inv\ e = Var{:}[tt, e = e'] \]

\[ inv^{\infty} e = (inv\ e)^{\infty}. \]
□ 
Thus, $inv\ e$ denotes the most general atomic transition that leaves the value
of the expression $e$ invariant, that is, unchanged. The program $inv^{\infty} x$, for
instance, is the most general program that never changes the value of $x$.

Given a predicate $P$, $inv\ P$ comprises all atomic transitions that do not
change the value of $P$. Note that invariance implies preservation, that is,

\[ inv\ P \subseteq_{\mathcal{T}^\dagger} pre\ P, \]


but not vice versa.
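
The difference between invariance and preservation can be checked on a small finite state space. The Python sketch below treats both notions as binary predicates on pairs of states only; the atomic statements themselves and the closure conditions are not modelled, and the function names are chosen for illustration.

```python
def pre(P):
    """Binary predicate pre P: if P holds before the step, it holds after."""
    return lambda s, t: (not P(s)) or P(t)

def inv(e):
    """Binary predicate inv e: the step leaves the value of e unchanged."""
    return lambda s, t: e(s) == e(t)

# States are integers; P says that the state is positive.
P = lambda s: s > 0
STATES = range(-2, 3)
```

Every pair of states that satisfies `inv(P)` also satisfies `pre(P)`, whereas the step from -1 to 1 satisfies `pre(P)` (P is false initially, so nothing is required) but not `inv(P)` (the value of P changes from false to true).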

Compositionality makes trace inclusion $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$ a powerful reasoning aid.
For instance, a program $C$ always leaves the value of $x$ invariant in all contexts iff
$C \subseteq_{\mathcal{T}^\dagger} inv^{\infty} x$. Similarly, $C$ always preserves the invariant $I = (mul = (y - cnt) \cdot x)$
in all contexts iff $C \subseteq_{\mathcal{T}^\dagger} pre^{\infty} I$.

Proposition  4.1  (Invariance  and  preservation) 

1.  A  program  C  can  only  change  the  variables  that  occur  free  in  it,  that  is, 
all  variables  that  do  not  occur  free  in  C  will  be  unchanged.  Formally, 

C  Qrt  inv^[Var\fv(C)) 

for  all  C.  Note  that  this  invariance  trivially  implies  preservation  of  all 
properties  over  variables  not  free  in  C,  that  is, 

$C \subseteq_{\mathcal{T}^\dagger} pre^\infty\, Preds(Var \setminus fv(C))$.

2. An atomic statement A preserves $\Delta$ in all contexts if

$cf_A \Rightarrow$ for all $P \in \Delta$. $\overleftarrow{P} \Rightarrow P$.

3.  If  C  preserves  A  in  all  contexts,  then  C*  does,  too. 

Proof:  Directly  from  the  definitions.  ■ 

Lemma 3.4 in the previous chapter examined sufficient conditions for
assumption-commitment formulas. The corollary below combines this information with
the proposition above.

Corollary  4.1  (Sufficient  condition  for  assumption-commitment) 

A  program  always  preserves  all  predicates  over  variables  that  do  not  occur  free 
in  it.  Formally, 

$[P, \Gamma]\ C\ [tt,\ Preds(Var \setminus fv(C))]$.


Proof:  Using  Proposition  4.1  and  Lemma  3.4. 

4.1.1 Fine-grained concurrency

Recall the definitions of the three boolean operations $B_1 \wedge_{lr} B_2$, $B_1 \wedge_{rl} B_2$, and
$B_1 \wedge_p B_2$ of Section 2.4. The first two compute the conjunction by evaluating
their arguments from left to right and from right to left, respectively. The last
evaluates both arguments in parallel. We can use trace inclusion to compare
evaluation strategies. Intuitively, $B_1 \wedge_p B_2$ is the most general, and ordinary
atomic conjunction $B_1 \wedge B_2$ is the most restrictive. Indeed, as shown in Figure 4.1,
the four operations form a lattice under trace inclusion. It is instructive
to see that these inclusions are proper.

$((\neg x \wedge y,\ \neg x \wedge y)(x \wedge \neg y,\ x \wedge \neg y),\ tt)$

is a trace of $B_1 \wedge_p B_2$, but not of $B_1 \wedge_{lr} B_2$. Moreover,

$((x \wedge \neg y,\ x \wedge \neg y)(\neg x \wedge y,\ \neg x \wedge y),\ tt)$

is a trace of $B_1 \wedge_{lr} B_2$, but not of $B_1 \wedge B_2$. In other words, under non-atomic,
left-to-right evaluation the conjunction can be evaluated to true even
if the evaluation did not contain a state in which both arguments were true
simultaneously. In Section 5.6 we will revisit non-atomic boolean expressions
and examine the interplay between evaluation strategies and refinement.
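
The strict inclusions between the four evaluation strategies can be replayed on a small model. The sketch below is only an illustration under stated assumptions: it takes $B_1$ to be the variable x and $B_2$ to be the variable y, restricts attention to two-step observations, and abstracts each strategy by the positions at which the two operands may be read (the dictionary `READ_POSITIONS` is a hypothetical encoding, not the definition from Section 2.4).

```python
from itertools import product

# A state is a pair (x, y) of truth values; B1 is read from x, B2 from y.
STATES = list(product([False, True], repeat=2))

# Assumed abstraction: a strategy is the set of (position of x-read,
# position of y-read) pairs it allows along a two-state observation.
READ_POSITIONS = {
    "atomic": [(0, 0), (1, 1)],                # both operands read in one state
    "lr": [(0, 0), (0, 1), (1, 1)],            # x is read no later than y
    "rl": [(0, 0), (1, 0), (1, 1)],            # y is read no later than x
    "par": [(0, 0), (0, 1), (1, 0), (1, 1)],   # reads in either order
}

def traces(strategy):
    """All observations (s0, s1, result) that the strategy can produce."""
    observed = set()
    for s0, s1 in product(STATES, repeat=2):
        pair = (s0, s1)
        for pos_x, pos_y in READ_POSITIONS[strategy]:
            value = pair[pos_x][0] and pair[pos_y][1]
            observed.add((s0, s1, value))
    return observed

T = {name: traces(name) for name in READ_POSITIONS}

# The two witness traces from the text, written with states as (x, y).
witness_par_not_lr = ((False, True), (True, False), True)
witness_lr_not_atomic = ((True, False), (False, True), True)
```

In this model the atomic strategy is strictly contained in both sequential strategies, and each of those is strictly contained in the parallel one, which matches the lattice shape described above.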


4.2  Context-sensitive  approximation 

Given  the  pleasant  metatheory  of  the  trace  semantics,  it  seems  natural  to  use 
it  also  directly  for  refinement.  However,  the  very  properties  that  make  the  se¬ 
mantics  so  well-suited  for  determining  the  meaning  of  a  parallel  program,  also 
render it unsuitable as a basis for refinement. To see why, suppose we considered
$C_2$ a refinement of $C_1$ iff the denotation of $C_2$ under $\mathcal{T}^\dagger$ is contained in that
of $C_1$ under $\mathcal{T}^\dagger$, that is, if $C_1 \supseteq_{\mathcal{T}^\dagger} C_2$. Full abstraction as proved in [Bro96b]
means that we have $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$ iff in all possible contexts the executions of
$C_1$ are contained in those of $C_2$ in the same context. Thus, whenever we want
to do refinement in a specific context, trace set inclusion will typically be too
strong, because it does not incorporate information about that particular context.
In other words, the suggested notion is not context-sensitive. Consider,
for instance, the programs $C_1 \equiv x := 1$ and $C_2 \equiv x := x + 1$. Clearly, the trace
sets of these programs are incomparable, that is, $C_1 \not\subseteq_{\mathcal{T}^\dagger} C_2 \not\subseteq_{\mathcal{T}^\dagger} C_1$.

However,  if  executed  in  initial  states  with  x  =  0  and  in  parallel  contexts  that  do 
not  change  the  value  of  x  if  it  is  0,  then  every  transition  of  Ci  can  be  matched 
by  C2  and  vice  versa. 

The  notion  of  approximation  introduced  below  is  context-sensitive.  It  allows 
us,  for  instance,  to  capture  the  intended  relationship  between  Ci  and  C2  above. 
It  will  form  the  basis  of  our  refinement  relation  to  be  introduced  in  the  next 
section.  As  we  will  see,  it  generalizes  trace  inclusion.  We  will  need  the  following 
notation. 

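Definition 4.2 (Execution inclusion modulo V)

Let V be a set of variables.

1. Two executions

$\alpha \equiv (s_0, l_0, s_1)(s_1, l_1, s_2) \cdots$
$\beta \equiv (t_0, l_0, t_1)(t_1, l_1, t_2) \cdots$

are equal modulo V,

$\alpha = \beta$ (mod V)

for short, if $s_0 = t_0$ and $s_i = t_i$ (mod V) for all $i \geq 1$, where $s = t$ (mod V)
abbreviates $\forall x \in Var \setminus V.\ s(x) = t(x)$. Note that $\alpha$ and $\beta$ must have
matching labels and identical initial states.

2. A set of executions $T_1$ contains another set of executions $T_2$ modulo a
variable x,

$T_1 \supseteq_{\mathcal{E}^\dagger} T_2$ (mod x)

for short, if for every execution $\beta$ in $T_2$ there exists an execution $\alpha$ in $T_1$
such that $\alpha = \beta$ (mod {x}).

Given a set V of variables, let $T_1 \supseteq_{\mathcal{E}^\dagger} T_2$ (mod V) be the obvious
generalization. Also, let

$C_1 \supseteq_{\mathcal{E}^\dagger} C_2$ (mod V)

stand for $\mathcal{E}^\dagger[\![C_1]\!] \supseteq_{\mathcal{E}^\dagger} \mathcal{E}^\dagger[\![C_2]\!]$ (mod V). □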
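
Part 1 of the definition can be made concrete for finite executions. The following is a minimal sketch under assumed encodings: a state is a dictionary from variable names to values, and an execution is a pair of an initial state and a list of `(label, next_state)` steps. The function names are hypothetical and infinite executions are not covered.

```python
def states_equal_mod(s, t, hidden):
    """s = t (mod V): s and t agree on every variable outside `hidden`."""
    visible = (set(s) | set(t)) - set(hidden)
    return all(s.get(v) == t.get(v) for v in visible)

def executions_equal_mod(alpha, beta, hidden):
    """Equality modulo `hidden` for finite executions.

    The initial states must be identical, the labels must match, and all
    later states only need to agree outside `hidden`.
    """
    (s0, steps_a), (t0, steps_b) = alpha, beta
    if s0 != t0 or len(steps_a) != len(steps_b):
        return False
    return all(
        label_a == label_b and states_equal_mod(s, t, hidden)
        for (label_a, s), (label_b, t) in zip(steps_a, steps_b)
    )

# Two executions that differ only in the later values of x.
alpha = ({"x": 0, "y": 0}, [("p", {"x": 1, "y": 0}), ("e", {"x": 1, "y": 5})])
beta = ({"x": 0, "y": 0}, [("p", {"x": 7, "y": 0}), ("e", {"x": 9, "y": 5})])
# Same as beta, but with a different initial value of x.
gamma = ({"x": 3, "y": 0}, [("p", {"x": 7, "y": 0}), ("e", {"x": 9, "y": 5})])
```

Here `alpha` and `beta` are equal modulo `{"x"}` but not modulo the empty set, while `gamma` is not equal to `alpha` modulo `{"x"}` because the initial states are compared exactly.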

In  Section  4.2.1  the  above  definition  will  be  used  to  capture  when  two  programs 
involving  a  local  variable  declaration  have  the  same  executions.  The  asymmetric 
treatment  of  the  initial  state  is  necessary  to  achieve  this  result. 

Definition 4.3 (Context-sensitive approximation)

Let $C_1$ and $C_2$ be unlabeled programs, E be a context, and V be a set of
variables. $C_2$ approximates $C_1$ with respect to E and modulo V,

$C_1 \geq_E C_2$ (mod V)

for short, iff

$E[\langle C_1 \rangle] \supseteq_{\mathcal{E}^\dagger} E[\langle C_2 \rangle]$ (mod V).

$C_1 =_E C_2$ (mod V) abbreviates $C_1 \geq_E C_2$ (mod V) and $C_2 \geq_E C_1$ (mod V), so
that $C_1 =_E C_2$ (mod V) iff $E[\langle C_1 \rangle] =_{\mathcal{E}^\dagger} E[\langle C_2 \rangle]$ (mod V). Also, $C_1 \geq_E C_2$ and
$C_1 =_E C_2$ abbreviate $C_1 \geq_E C_2$ (mod $\emptyset$) and $C_1 =_E C_2$ (mod $\emptyset$), respectively.

□

Intuitively, $C_1 \geq_E C_2$ (mod V) if E causes $C_2$ to exhibit only transitions that
can be matched by $C_1$ modulo V. In other words, E cannot force $C_2$ to go
beyond what $C_1$ can do.

Example  4.1  (Context-sensitive  approximation) 

1. Consider the following three contexts.

$E_1 \equiv \{x = 0\}\,;\,[\,[\,] \parallel inv^\infty x\,]$

$E_2 \equiv \{x = 0\}\,;\,[\,[\,] \parallel inv^\infty (x = 0)\,]$

$E_3 \equiv \{x = 0\}\,;\,[\,[\,] \parallel pre^\infty (x = 0)\,]$

Clearly, in initial states with x = 0 and parallel environments which do not
change the value of x, the assignments x:=1 and x:=x + 1 have matching
transitions. That is,

$x := 1 \ =_{E_1}\ x := x + 1$.

However, the assumptions embodied in $E_1$ are stronger than necessary.
Context $E_2$, for instance, allows for x to change arbitrarily as long as
the value of the predicate x = 0 is unchanged. Thus, an environment
transition can neither assign a non-zero value to x if its current value is
0, nor can it assign 0 to x if its current value is not 0. That is,

$x := 1 \ =_{E_2}\ x := x + 1$.

However, $E_2$ is still unnecessarily strong. It suffices to require that x is
unchanged if it is 0. In other words, the predicate x = 0 must be preserved.
This requirement is expressed in $E_3$. We have

$x := 1 \ =_{E_3}\ x := x + 1$.

Context $E_3$ allows x to be changed arbitrarily as long as its value is not
0. Once x is 0, it must continue to be 0. In contrast to $E_2$, the value of
x can thus change from non-zero to 0.


2. Let $E_4 \equiv \{y > 0\}\,;\,[\,[\,] \parallel z := 0\,]$. Then,

$x{:}[tt,\ x > \overleftarrow{x}] \ \geq_{E_4}\ x := x + y$,

but

$x := x + y \ \not\geq_{E_4}\ x{:}[tt,\ x > \overleftarrow{x}]$.

The first approximation holds, because the assignment x:=x + y on the
right-hand side will always increase x since y is known to be greater than
0. The second approximation fails, because $x{:}[tt,\ x > \overleftarrow{x}]$ can increase x by
any value, not only by the value of y. For example, let y = 1 and x = v
in the initial state; then $x{:}[tt,\ x > \overleftarrow{x}]$ has a transition to a final state with
x = v + 2 that x:=x + y cannot match.

3. Let $E_5 \equiv \{y > 0\}\,;\,[\,[\,] \parallel y := 0\,]$. Then,

$x{:}[tt,\ x > \overleftarrow{x}] \ \not\geq_{E_5}\ x := x + y$,

because in a state with y = 0, x:=x + y has transitions that cannot be
matched by $x{:}[tt,\ x > \overleftarrow{x}]$. □
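
The first item of the example can be checked mechanically on a small value range. The sketch below is a simplification, not the definition itself: for a single atomic statement placed in a context of the form {P};[[] || env], it assumes that comparing executions reduces to comparing the program step in every state the environment can reach from an initial state. The value range `DOMAIN` and all function names are hypothetical.

```python
DOMAIN = range(4)  # small illustrative value range for x

def reachable(initial, env):
    """States reachable from `initial` by repeated environment steps."""
    seen, frontier = set(initial), list(initial)
    while frontier:
        s = frontier.pop()
        for t in DOMAIN:
            if env(s, t) and t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

def approximates(c1, c2, initial, env):
    """Simplified check of c1 >=_E c2 for atomic statements: in every
    reachable state, each outcome of c2 is also an outcome of c1."""
    return all(c2(s) <= c1(s) for s in reachable(initial, env))

# Atomic statements map a value of x to the set of possible new values.
set_one = lambda x: {1}          # x := 1
increment = lambda x: {x + 1}    # x := x + 1

# Environment steps (s, t) allowed by the parallel component of each context.
inv_x = lambda s, t: t == s                       # as in E1
inv_x_is_0 = lambda s, t: (s == 0) == (t == 0)    # as in E2
pre_x_is_0 = lambda s, t: s != 0 or t == 0        # as in E3
anything = lambda s, t: True                      # unrestricted environment

INITIAL = {0}  # the assertion {x = 0}
```

Under all three restricted environments only the state x = 0 is reachable, so the two assignments match in both directions. With the unrestricted environment the check fails, for example in the state x = 1.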

4.2.1  Properties  of  context-sensitive  approximation 

The next lemma states a few helpful properties. First, context-sensitive
approximation is transitive. Second, context-insensitive approximation $C_1 \supseteq_{\mathcal{T}^\dagger} C_2$ can
be viewed as a special case of context-sensitive approximation $C_1 \geq_E C_2$ where
the environment E is maximally general and unrestricted.

Lemma 4.1 (Properties of context-sensitive approximation)

1. $C_1 \geq_E C_2$ (mod V) and $C_2 \geq_E C_3$ (mod V) implies $C_1 \geq_E C_3$ (mod V).

2. $C_1 \supseteq_{\mathcal{T}^\dagger} C_2$ iff $C_1 \geq_E C_2$ where $E \equiv [\,] \parallel Var{:}[tt,tt]^\infty$.

3. $C_1 \supseteq_{\mathcal{T}^\dagger} C_2$ iff $C_1 \geq_E C_2$ for all contexts E.

Proof:

1. $E[\langle C_1 \rangle] \supseteq_{\mathcal{E}^\dagger} E[\langle C_2 \rangle]$ (mod V) and $E[\langle C_2 \rangle] \supseteq_{\mathcal{E}^\dagger} E[\langle C_3 \rangle]$ (mod V) clearly
imply $E[\langle C_1 \rangle] \supseteq_{\mathcal{E}^\dagger} E[\langle C_3 \rangle]$ (mod V).

2. The context $[\,] \parallel Var{:}[tt,tt]^\infty$ has an important property. The program
$Var{:}[tt,tt]^\infty$ has the capability to change any variable arbitrarily. It can
thus always realize any state change from $s_i'$ to $s_{i+1}$ across a boundary.
Consequently, there is a one-to-one correspondence between the transition
traces of $\langle C \rangle$ and the executions of $\langle C \rangle \parallel Var{:}[tt,tt]^\infty$. That is, for every
transition trace

$(s_0, p, s_0')(s_1, p, s_1') \cdots$

of a program $\langle C \rangle$, there is an execution

$(s_0, p, s_0')(s_0', e, s_1)(s_1, p, s_1') \cdots$

of $E[\langle C \rangle]$ and vice versa.

$\Rightarrow$: This direction follows from the congruence property. $\Leftarrow$: Let $\alpha$ be
a trace of $C_2$ and let $\alpha'$ be the corresponding execution of $E[\langle C_2 \rangle]$ using
the property above. By assumption, $\alpha'$ also is an execution of $E[\langle C_1 \rangle]$.
Again, by the above property, $\alpha$ also is a trace of $C_1$.

3. $\Rightarrow$: By congruence. $\Leftarrow$: We show the contrapositive. Let $\alpha$ be a trace
of $C_2$ but not of $C_1$. Then, $E[\langle C_2 \rangle]$ has an execution that $E[\langle C_1 \rangle]$ does
not, where $E \equiv [\,] \parallel Var{:}[tt,tt]^\infty$. Using the above property, the execution
corresponding to $\alpha$ is in $E[\langle C_2 \rangle]$ but not in $E[\langle C_1 \rangle]$. ■

The above lemma allows us to weaken context-sensitive approximation by
using trace inclusion and by enlarging the set of “modulo” variables.

Corollary 4.2 (Weakening context-sensitive approximation)

Suppose $C_2 \geq_E C_3$ (mod V).

1. If $C_1 \supseteq_{\mathcal{T}^\dagger} C_2$, then $C_1 \geq_E C_3$ (mod V).

2. If $C_3 \supseteq_{\mathcal{T}^\dagger} C_4$, then $C_2 \geq_E C_4$ (mod V).

3. If $V \subseteq V'$, then $C_2 \geq_E C_3$ (mod $V'$).

□ 

The  presence  of  labels  in  traces  gives  rise  to  a  finer  grained  notion  of  equiv¬ 
alence.  If  a  labeled  program  is  equivalent  to  another  labeled  program,  then  the 
corresponding  unlabeled  versions  are  also  equivalent,  whereas  the  converse  is 
not  necessarily  true. 

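Lemma 4.2 (Labeled trace inclusion implies unlabeled)

For all contexts E,

1. $E[\langle C_1 \rangle] \supseteq_{\mathcal{T}^\dagger} E[\langle C_2 \rangle]$ implies $E[C_1] \supseteq_{\mathcal{T}^\dagger} E[C_2]$.

2. $E[\langle C_1 \rangle] \supseteq_{\mathcal{E}^\dagger} E[\langle C_2 \rangle]$ implies $E[C_1] \supseteq_{\mathcal{E}^\dagger} E[C_2]$.

Proof: Both propositions follow from the fact that if two labeled transition
traces are equal, their unlabeled counterparts will also be equal. ■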

To see why the reverse direction does not hold, consider the following
counterexample. The unlabeled program

await B || new x = 0 in while tt do x:=x + 1

is equivalent to

{B} || new x = 0 in while tt do x:=x + 1.

The right-hand program exhibits infinite stuttering so that the absence of the
infinitely stuttering disjunct on the left-hand side is not noticed. If,
however, the left program is labeled, the equivalence fails, that is,

$\langle$await B$\rangle$ || new x = 0 in while tt do x:=x + 1

is not equivalent to

$\langle$\{B\}$\rangle$ || new x = 0 in while tt do x:=x + 1.

More precisely, the first program contains the trace

$(s, e, s)(s, p, s)(s, e, s)(s, p, s)(s, e, s)(s, p, s) \cdots$

where s is a state that falsifies B, whereas the second does not.

Lemma 4.3 (Execution inclusion modulo V and declarations)

We have

$C_1 \subseteq_{\mathcal{E}^\dagger} C_2$ (mod $\{x_1, \ldots, x_n\}$)

if and only if

new $x_1 = e_1, \ldots, x_n = e_n$ in $C_1$ $\subseteq_{\mathcal{E}^\dagger}$ new $x_1 = e_1, \ldots, x_n = e_n$ in $C_2$

where $e_i$ is a constant or a variable for all $1 \leq i \leq n$.

Proof: We show the case n = 1. The general case follows.

$\Rightarrow$: Let $\alpha$ be an execution of new x = e in $C_1$ where x has value $v_0$ initially.
Also, we define the update of a trace $\alpha \equiv (s_0, l_0, s_0')(s_1, l_1, s_1') \cdots$ by

$[\alpha \mid x = v] \equiv ([s_0 \mid x = v],\ l_0,\ [s_0' \mid x = v])([s_1 \mid x = v],\ l_1,\ [s_1' \mid x = v]) \cdots$.

Let v be the value of e in $s_0$. By definition, $C_1$ has an execution $\alpha'$ where
x is v initially and $\alpha = [\alpha' \mid x = v_0]$. By assumption, $C_2$ has an execution
$\beta'$ such that $\alpha' = \beta'$ (mod {x}). Thus, the initial state of $\beta'$ also satisfies
x = v. Thus by definition, $[\beta' \mid x = v_0]$ is an execution of new x = v in $C_2$.
Moreover, $\alpha = [\alpha' \mid x = v_0] = [\beta' \mid x = v_0]$.

$\Leftarrow$: Let $(x = v)\alpha$ be an execution of $C_1$ where e = v and $x = v_0$ in $first(\alpha)$.
Then, $[\alpha \mid x = v_0]$ is an execution of new x = e in $C_1$ and by assumption
also of new x = e in $C_2$. By definition of new, $C_2$ has an execution
$(x = v)\beta$ such that e = v and $x = v_0$ in $first(\beta)$ and $[\beta \mid x = v_0] = [\alpha \mid x = v_0]$.
Thus, $\alpha = \beta$ (mod {x}). ■

4.2.2 The power of context-sensitive approximation

While Example 4.1 above clarifies Definition 4.3, it does not demonstrate the full
power of context-sensitive approximation. To this end, consider the following
scenario. Let C be a distributed system. Suppose C contains a server component
S that receives commands from its environment via some channel cmd to update
a data structure. More precisely, let C be of the form E[S] where

S ≡ new c = no_op, done = ff in
      while ¬done do
        cmd?c;
        case c of
          cmd_1 : C_1 |
          ...
          cmd_n : C_n
        end
      od
    end.

If the environment E does not issue certain commands, the server can be
simplified. For instance, if E never outputs commands cmd_i through cmd_n to
channel cmd with i ≤ n, S can safely be replaced by

S' ≡ new c = no_op, done = ff in
       while ¬done do
         cmd?c;
         case c of
           cmd_1 : C_1 |
           ...
           cmd_{i-1} : C_{i-1}
         end
       od
     end

that is, we have

$S =_E S'$

which implies

$C \equiv E[S] =_{\mathcal{T}^\dagger} E[S']$

with Lemma 4.2. The point is that context E can be arbitrarily complex. It can
place the server S in the scope of loops, declarations, parallel compositions, or
after synchronization statements. Of course, the more complex E is, the harder
it probably will be to ascertain that E does not issue cmd_i through cmd_n.

Moreover, context-sensitive approximation can also serve as a specification
tool that allows for the concise expression of complex program properties.
Suppose we want to formalize that the server behaviour in the given environment
always has a certain property. Let S' be a program that captures this property.
The behaviour of S in E meets the specification S' if and only if $S' \geq_E S$. Note
that the full power of trace sets is available to express S'.

For another, more concrete example, let await B be an await statement
in some program C, that is, C ≡ E[await B] for some context E. Recall that
blocking is defined as infinite stuttering, that is, await B ≡ $\{B\} \vee \{\neg B\}^\omega$.
Control always eventually gets past await B in C if and only if the disjunct
that models infinite blocking can be removed without changing the behaviour,
that is, if and only if await B $=_E$ {B}. This idea will be crucial in Chapter 8 to
formalize that a mutual exclusion algorithm has the eventual entry property, that
is, that every process that has started the entry protocol will always eventually
be allowed to enter the critical region.


4.3  Context-approximation 

It seems natural to introduce a pre-order on contexts that orders contexts with
respect to their “discriminating power”. For CCS, for instance, this was carried
out by Larsen in [Lar87]. In [Din96], we define $E_1 \sqsubseteq E_2$ to mean that context
$E_2$ is at least as discriminating as context $E_1$. We do the same here.

Definition 4.4 (Context approximation)

$E_2$ is at least as discriminating as $E_1$, $E_1 \sqsubseteq E_2$ for short, if for all programs $C_1$
and $C_2$, $C_1 \geq_{E_2} C_2$ implies $C_1 \geq_{E_1} C_2$. □

Example 4.2 (Context approximation) We revisit the contexts

$E_1 \equiv \{x = 0\}\,;\,[\,[\,] \parallel inv^\infty x\,]$

$E_2 \equiv \{x = 0\}\,;\,[\,[\,] \parallel inv^\infty (x = 0)\,]$

$E_3 \equiv \{x = 0\}\,;\,[\,[\,] \parallel pre^\infty (x = 0)\,]$

from Example 4.1. Moreover, let

$E_0 \equiv \{x = 0\}\,;\,[\,[\,] \parallel$ while tt do y:=y + 1$\,]$

$E_4 \equiv \{x = 0\}\,;\,[\,[\,] \parallel Var{:}[tt,tt]^\infty\,]$.

1. Context $E_3$ can only do those transitions that preserve the value of the
predicate x = 0, whereas context $E_4$ can do any transition at any time.
Every approximation that holds with respect to $E_4$ will also hold with
respect to $E_3$, whereas the converse is not true. $E_4$ is more general and
thus has more discriminating power. Consequently, $E_3 \sqsubseteq E_4$.

2. Context $E_1$ can only do those transitions that leave x invariant. Context
$E_2$, however, can change the value of x as long as the predicate x = 0 is
left invariant and thus is more discriminating. Context $E_3$ in turn is more
discriminating than $E_2$, because $E_3$ is allowed to change x arbitrarily if
x = 0 is false, whereas $E_2$ has to leave the value of x = 0 unchanged.
More precisely, $E_3$ is able to change the value of x = 0 from false to true,
whereas $E_2$ is not. Consequently, $E_1 \sqsubseteq E_2 \sqsubseteq E_3$.

3. Finally, context $E_0$ is the least discriminating, because it is the most
specific. More precisely, $E_0 \sqsubseteq E_1$, because the parallel program in context
$E_0$ never changes the value of x.

Consequently, we have

$E_0 \sqsubseteq E_1 \sqsubseteq E_2 \sqsubseteq E_3 \sqsubseteq E_4$.

□
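
The ordering in the example follows the sizes of the environments' repertoires of atomic steps on x. A minimal sketch, assuming a three-value range for x and representing each parallel component only by the set of x-transitions it may perform (the names below are illustrative, and the component of the least discriminating context is omitted because it never touches x):

```python
DOMAIN = range(3)  # small illustrative value range for x
ALL = {(s, t) for s in DOMAIN for t in DOMAIN}

# Environment steps on x permitted by the parallel component of each context.
inv_x = {(s, t) for (s, t) in ALL if s == t}                      # leaves x invariant
inv_x_is_0 = {(s, t) for (s, t) in ALL if (s == 0) == (t == 0)}   # leaves x = 0 invariant
pre_x_is_0 = {(s, t) for (s, t) in ALL if s != 0 or t == 0}       # preserves x = 0
unrestricted = ALL                                                 # any step at any time

# A step that changes the value of the predicate x = 0 from false to true.
false_to_true = (1, 0)
```

The four sets form a strictly increasing chain, and `false_to_true` is exactly the kind of step that separates preservation of x = 0 from invariance of x = 0.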


4.3.1 Properties of context-approximation

Context approximation formalizes assumption-commitment reasoning and thus
allows for modular proofs of approximations like $C_1 \geq_E C_2$. To see this,
suppose we want to show $C_1 \geq_E C_2$. Furthermore, suppose that inspection of
the two programs reveals that the most general context in which the
approximation holds is $E'$. Then, the proof of $C_1 \geq_E C_2$ can be reduced to showing
$E \sqsubseteq E'$. Context-approximation thus is a convenient reasoning tool. The
following lemma collects a few sufficient conditions for establishing when one
context approximates another. Enlarging the set of transition traces of a
parallel component of a context increases that context's discriminating power; that
is, the resulting context will be at least as discriminating. Moreover, weakening
a predicate also gives the context more behaviour and thus more
discriminating power. Finally, adding local variables decreases the discriminating power.
Informally, local variables around a context act as an “equalizer”. Consider, at
the most extreme end, the context new $x_1 = e_1, \ldots, x_n = e_n$ in E where $x_1$
through $x_n$ are all the free variables of $C_1$ and $C_2$. If $C_1$ and $C_2$ have finite (or
infinite) traces only, this context will equate both programs regardless of their
behaviour, that is, $C_1$ and $C_2$ are equivalent with respect to it. This is because
$fv(C) \subseteq \{x_1, \ldots, x_n\}$ implies

new $x_1 = e_1, \ldots, x_n = e_n$ in C $=_{\mathcal{T}^\dagger}$ skip

if C has only finite traces, and

new $x_1 = e_1, \ldots, x_n = e_n$ in C $=_{\mathcal{T}^\dagger}$ while tt do skip

if C has only infinite traces.

Lemma 4.4 (Properties of context-approximation)

For all contexts E,

1. If $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$ then $E[\,[\,] \parallel C_1] \sqsubseteq E[\,[\,] \parallel C_2]$.

2. If $P_1 \Rightarrow P_2$ then $E[\{P_1\}\,;\,E'] \sqsubseteq E[\{P_2\}\,;\,E']$.

3. new x = v in E $\sqsubseteq$ E.

Proof:

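1. Let $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$ and $C \geq_{E[[\,] \parallel C_2]} C'$ for some C and C'. We need to show
$C \geq_{E[[\,] \parallel C_1]} C'$. Let $\alpha \in \mathcal{E}^\dagger[\![E[\langle C' \rangle \parallel C_1]]\!]$. Suppose $\alpha \notin \mathcal{E}^\dagger[\![E[\langle C \rangle \parallel C_1]]\!]$.
Case: There exists a longest prefix $\alpha'$ of $\alpha$ that can be extended to an
execution in both sets. Then, $\alpha'$ is followed by a program transition $(s, p, s')$
of C' that C cannot match. Since $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$, $\alpha'(s, p, s')$ is also the prefix
of some execution of $E[\langle C' \rangle \parallel C_2]$ but not of $E[\langle C \rangle \parallel C_2]$, which contradicts
$C \geq_{E[[\,] \parallel C_2]} C'$. Case: There is no such longest prefix $\alpha'$ of $\alpha$, that is, every
prefix of $\alpha$ can be extended to executions of $E[\langle C' \rangle \parallel C_1]$ and $E[\langle C \rangle \parallel C_1]$.
Consequently, $\alpha$ is infinite. Moreover, there are infinitely many program
transitions by $\langle C' \rangle$ along $\alpha$. Together, these program transitions form
a trace $\beta$ that cannot be matched by $\langle C \rangle$ in the same context. Since
$C_1 \subseteq_{\mathcal{T}^\dagger} C_2$, $\langle C' \rangle$ still has $\beta$ in context $E[[\,] \parallel C_2]$. However, due to the
separation between program and environment steps, $\langle C \rangle$ still cannot exhibit
$\beta$ in context $E[[\,] \parallel C_2]$. Consequently, $\alpha$ is an execution of $E[\langle C' \rangle \parallel C_2]$ but
not of $E[\langle C \rangle \parallel C_2]$, which contradicts $C \geq_{E[[\,] \parallel C_2]} C'$.

2. The premise $P_1 \Rightarrow P_2$ implies $\{P_1\} \subseteq_{\mathcal{T}^\dagger} \{P_2\}$. The remainder of the
argument is similar to the one in the previous case.

3. Let $C \geq_E C'$ and let $\alpha \backslash x$ be an execution of new x = e in $E[\langle C' \rangle]$ such
that e = v in $first(\alpha)$. We need to show that $\alpha \backslash x$ also is an execution
of new x = e in $E[\langle C \rangle]$. By definition, $(x = v)\alpha$ is an execution of
$E[\langle C' \rangle]$. By assumption, $(x = v)\alpha$ also is an execution of $E[\langle C \rangle]$. Thus,
by definition, $\alpha \backslash x$ also is an execution of new x = e in $E[\langle C \rangle]$. ■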

Informally, a context E can be viewed as a function that when applied to a
program C returns the program E[C]. It is tempting to try to define
approximation between two contexts as a pointwise ordering between the functions
represented by the contexts, that is, $E_1 \sqsubseteq E_2$ if and only if $E_1[C] \subseteq_{\mathcal{T}^\dagger} E_2[C]$
for all programs C. While Lemma 4.4.1 and 4.4.2 would still be valid under this
definition, Lemma 4.4.3 would not. For instance, the empty context [] would
cease to be as discriminating as the context new x = 0 in [], because the traces
of C and new x = 0 in C are not comparable in general.

4.3.2 Game-theoretic interpretation

In previous work [Din97] we use the simple syntactic structure of UNITY to
interpret context-sensitive approximation as a game-playing activity. Intuitively,
the game-theoretic interpretation of $C_1 \geq_E C_2$ is as follows. Suppose that the
adversary makes moves in both the environment E and program $C_2$ while the
player controls $C_1$. In [Din97], we prove that $C_1 \geq_E C_2$ iff there is no
sequence of moves, alternating between player and adversary, which ends in a
state in which the adversary can find a transition of $C_2$ for which the player
cannot find a matching transition of $C_1$. In the light of this game-theoretic
characterization, the context pre-order $E_1 \sqsubseteq E_2$ can be interpreted as
comparing the “repertoire” of moves that $E_1$ and $E_2$ offer. For example, a game
involving $E_2 \equiv [\,] \parallel Var{:}[tt,tt]^\infty$ is easier for the adversary to win than a game
involving $E_1 \equiv [\,] \parallel inv^\infty x$, because $E_2$ offers a larger repertoire of moves for the
adversary. The work in [Din97] thus gives a very intuitive game-theoretic
interpretation of the refinement process and the notions involved.

Chapter 5

Refinement 


We now have the right tools to define a notion of refinement that meets the
requirements stated in the introduction. Before we present the definition of the
refinement relation that our calculus is based on in Section 5.2, we sharpen our
intuition by first considering a plausible, though ultimately unsuitable, candidate.


5.1  Using  assumption-commitment  only 

In Morris' and Morgan's refinement calculi a sequential program C' refines another
sequential program C iff for all postconditions Q, the weakest precondition
of C with respect to Q implies the weakest precondition of C' with respect to
Q [Mor87, Mor94]. Formally,

$wp(C, Q) \Rightarrow wp(C', Q)$

for all Q. Given that the Hoare-triple

{P} C {Q}

holds iff $P \Rightarrow wp(C, Q)$, refinement between C and C' thus means that every
Hoare-triple satisfied by C will also be satisfied by C'. This is consistent with
our intuition that refinement typically means a decrease in nondeterminism.
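
The weakest-precondition view of sequential refinement can be tried out on a finite state space. The sketch below is an illustration under simplifying assumptions: states are the values 0 to 3 of a single variable, programs are total and map each state to a non-empty set of possible final states, and the two example programs `at_least` and `bump` are hypothetical.

```python
from itertools import chain, combinations

STATES = range(4)  # small illustrative state space: the value of x

def wp(program, post):
    """States from which every outcome of `program` lies in `post`."""
    return {s for s in STATES if program(s) <= post}

def all_postconditions():
    """Every subset of STATES, viewed as a postcondition."""
    items = list(STATES)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return (set(subset) for subset in subsets)

def refines(abstract, concrete):
    """True iff wp(abstract, Q) implies wp(concrete, Q) for all Q."""
    return all(wp(abstract, q) <= wp(concrete, q) for q in all_postconditions())

# Abstract program: set x to any value that is at least its current value.
at_least = lambda x: {v for v in STATES if v >= x}
# Concrete program: increase x by one, saturating at the largest value.
bump = lambda x: {min(x + 1, 3)}
```

The deterministic `bump` refines the nondeterministic `at_least`, but not the other way round, which reflects the decrease in nondeterminism mentioned above.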

It is well known that, in the presence of concurrency, Hoare-triples are no
longer adequate; see, e.g., [OG76a]. At the end of the previous section, we
presented assumption-commitment formulas

$[P, \Gamma]\ C\ [Q, \Delta]$.

A natural first attempt to define a refinement relation for concurrent programs
would therefore be to use these assumption-commitment formulas in the same
way as Hoare-triples have been used for the definition of refinement for sequential
programs. More precisely, suppose we define that C is refined by C' iff every
assumption-commitment formula satisfied by C also holds for C', that is, for all
P, $\Gamma$, Q, and $\Delta$,

$[P, \Gamma]\ C\ [Q, \Delta]$

implies

$[P, \Gamma]\ C'\ [Q, \Delta]$.

As straightforward and intuitive as this definition is, there is a serious problem
with it that renders it unsuitable for our purposes.


5.1.1 A problem with context-sensitivity

The  notion  of  refinement  suggested  above  suffers  from  the  same  drawback  as 
trace  inclusion.  It  is  not  context-sensitive.  For  all  preconditions  and  parallel 
contexts,  the  behaviour  of  the  refining  program  C'  is  a  subset  of  the  behaviour 
of  the  refined  program  C.  The  quantification  over  all  preconditions  and  paral¬ 
lel  contexts  prevents  the  refinement  notion  from  making  use  of  the  particular 
environment  assumptions  embodied  in  the  given  context  and  thus  kills  context- 
sensitivity. 

To illustrate this point, suppose we want to replace a complex, high-level
computation with a sequence of simpler, lower-level ones. For instance, the
abstract program to compute the maximum of two variables y and z,

C ≡ {x}:[tt, x = max(y, z)]

can be implemented by a conditional statement

C' ≡ if y > z then x:=y else x:=z.

The problem is that C cannot be replaced by C' in all contexts. Consequently,
there is an assumption-commitment formula satisfied by the first program but
not by the second. To show that C correctly sets x to the maximum of y and z
it is sufficient to assume the preservation of x = max(y, z), that is, we have

[tt, {x = max(y, z)}] {x}:[tt, x = max(y, z)] [x = max(y, z), ∅].

In contrast, to show that C' correctly sets x to the maximum of y and z,
additional assumptions are necessary, that is, the following assumption-commitment
formula

[tt, {x = max(y, z)}] if y > z then x:=y else x:=z [x = max(y, z), ∅]

is not valid, because the parallel environment can change y or z right after
evaluation of the condition y > z. Thus, the second program is not a refinement
of the first in the sense above. The problem is that refinement as suggested
above does not allow us to express that C' is a refinement of C only in
certain contexts and thus under certain environment assumptions.
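
The interference that breaks the conditional can be found by brute force. The following sketch is an illustration under assumptions: values range over 0 to 2, a state is a triple (x, y, z), the environment takes exactly one step between the test and the assignment, and that step only has to preserve x = max(y, z). All names are hypothetical.

```python
from itertools import product

VALUES = range(3)
STATES = list(product(VALUES, repeat=3))  # states are triples (x, y, z)

def holds(state):
    """The predicate x = max(y, z)."""
    x, y, z = state
    return x == max(y, z)

def env_ok(before, after):
    """Environment step allowed by the assumption: it preserves x = max(y, z)."""
    return (not holds(before)) or holds(after)

def atomic_finals(state):
    """Atomic specification: a single step that sets x to max(y, z)."""
    x, y, z = state
    return {(max(y, z), y, z)}

def conditional_finals(state):
    """Conditional with one environment step between test and assignment."""
    x, y, z = state
    take_y = y > z                      # the test is evaluated in `state`
    finals = set()
    for after in STATES:                # the environment interferes here
        if env_ok(state, after):
            _, new_y, new_z = after
            finals.add((new_y if take_y else new_z, new_y, new_z))
    return finals

violations = [
    (start, final)
    for start in STATES
    for final in conditional_finals(start)
    if not holds(final)
]
```

One violation starts in (0, 2, 1): the test selects y, the environment then lowers y to 0, and the assignment produces x = 0 although max(y, z) is now 1. The atomic specification has no such outcome.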

Another example arises in a setting with finer-grained parallelism. Consider
x:[tt, x even] and x:=x + x. If the evaluation of x + x is not atomic, we have

[tt, {x even}] x:[tt, x even] [x even, Preds(∅)]

but not

[tt, {x even}] x:=x + x [x even, Preds(∅)],

because if x is odd initially and then changed to even halfway through the
evaluation of x + x, then the result will be odd. However, there clearly are
contexts in which it is safe to replace the atomic program by its non-atomic
counterpart. As in the above example, the notion of refinement based solely on
assumption-commitment formulas does not allow us to formulate this situation.
We therefore find this notion not suitable for our purposes.


5.2  Combining  assumption-commitment  and 
context-sensitive  approximation 

Intuitively, we want a notion of refinement to express that given an environment
of a certain shape, the transitions of the refining program C' can always be
matched by the refined program C. To achieve this we combine
assumption-commitment reasoning and context-sensitive approximation. As in the previous
attempt, our starting point is the assumption-commitment formula

$[P, \Gamma]\ C\ [Q, \Delta]$

of Section 3 where P is the precondition, $\Gamma$ is the set of predicates to be
preserved by the parallel environment, Q is the postcondition, and $\Delta$ is the set of
predicates preserved by C. However, in contrast to the previous definition, we
make the refinement of C into C' relative to the assumptions and commitments
embodied by $[P, \Gamma]$ and $[Q, \Delta]$ respectively. As a first approximation, we use

$[P, \Gamma]\ C \succ C'\ [Q, \Delta]$

to express that C' refines C under the assumptions P and $\Gamma$ and the
commitments Q and $\Delta$. More formally, assuming that

•  the initial state satisfies P and

•  the parallel environment preserves all the predicates in $\Gamma$,

then

•  C will be able to match every transition of C' and

•  both C and C' preserve all the predicates in $\Delta$ and

•  if C and C' terminate, they do so in a state satisfying Q.

We thus arrive at the following tentative definition.

$[P, \Gamma]\ C \succ C'\ [Q, \Delta]$

iff

$[P, \Gamma]\ C\ [Q, \Delta]$

and

$[P, \Gamma]\ C'\ [Q, \Delta]$

and

$C \geq_E C'$

where E is the most general context that starts in a state satisfying P and
preserves all predicates in $\Gamma$, that is,

$E \equiv \{P\}\,;\,[\,[\,] \parallel pre^\infty\, \Gamma\,]$.

To illustrate the use of this relation, consider, for instance, the programs

$C \equiv x := x + 1$
$C' \equiv x := 2.$

We want to refine $C$ into $C'$. Assuming an initial state that satisfies $x = 1$ and a
parallel context that preserves $x = 1$, every transition of $C'$ can be matched by
$C$ and thus $C$ can be refined into $C'$ (and vice versa). If, moreover, the parallel
context also preserves $x = 2$ we can conclude that $x$ will have value 2 upon
termination. Also, $C'$ preserves a number of predicates including, for instance,
$y = n$ for all $n$, but also $x > 0$ and $x \bmod 2 = 0$. In our calculus this will be
expressed by


$$[x = 1, \{x = 1, x = 2\}]\ \ x := x + 1 \ \succ\ x := 2\ \ [x = 2, \Delta]$$

where $\Delta$ is a set of predicates preserved by $C$ and $C'$ under the given
assumptions, that is,

$$[x = 1, \{x = 1, x = 2\}]\ \ x := x + 1\ \ [x = 2, \Delta]$$

and

$$[x = 1, \{x = 1, x = 2\}]\ \ x := 2\ \ [x = 2, \Delta].$$

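More precisely, both $C$ and $C'$ preserve all $P$ for which

$$(P \wedge cf_{x:=2} \Rightarrow P) \wedge (P \wedge cf_{x:=x+1} \Rightarrow P).$$

That is, we can choose

$$\Delta \subseteq \{P \mid (P \wedge cf_{x:=2} \Rightarrow P) \wedge (P \wedge cf_{x:=x+1} \Rightarrow P)\}.$$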
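
The example above can be replayed on a small finite state space. The following Python sketch is only an illustration under stated assumptions: states are dictionaries over x and y, the sample range for y is arbitrary, and the helper names (`preserves`, `candidates`) are hypothetical and not part of the calculus. It checks that both assignments take the same step from states with x = 1, and which of a few candidate predicates are preserved by both assignments from those states.

```python
def assign_x_plus_1(s):
    """C: x := x + 1"""
    return {**s, "x": s["x"] + 1}

def assign_x_2(s):
    """C': x := 2"""
    return {**s, "x": 2}

def preserves(stmt, pred, states):
    """pred is preserved if it still holds after every step taken
    from a state in `states` in which it held."""
    return all(pred(stmt(s)) for s in states if pred(s))

# States satisfying the precondition x = 1 (y sampled from a small range).
initial = [{"x": 1, "y": y} for y in range(-2, 3)]

# Under x = 1 both assignments produce the same successor state.
same_step = all(assign_x_plus_1(s) == assign_x_2(s) for s in initial)

# Candidate commitments, checked only from the assumed initial states.
candidates = {
    "x > 0": lambda s: s["x"] > 0,
    "x mod 2 = 0": lambda s: s["x"] % 2 == 0,
    "y = 0": lambda s: s["y"] == 0,
    "x = 1": lambda s: s["x"] == 1,
}
delta = [name for name, p in candidates.items()
         if preserves(assign_x_plus_1, p, initial)
         and preserves(assign_x_2, p, initial)]
```

Note that x mod 2 = 0 is preserved only vacuously here, since it does not hold in any state with x = 1, whereas x = 1 itself is not preserved by either assignment.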
As another example, consider the programs

$C \equiv x := 1 ; x := x + 1$
$C' \equiv x := 1 ; x := 2.$

If the parallel environment preserves the predicates $x = 1$ and $x = 2$, then $C$
can match every transition of $C'$ and $x$ will equal 2 upon termination. Formally,

$$[\mathit{tt}, \{x = 1, x = 2\}]\ \ x := 1 ; x := x + 1 \ \succ\ x := 1 ; x := 2\ \ [x = 2, \Delta]$$

where $\Delta$ contains at most all predicates that are preserved by the three
assignments $x := 1$, $x := x + 1$, and $x := 2$, that is,

$$\Delta \subseteq \{P \mid (P \wedge cf_{x:=1} \Rightarrow P) \wedge (P \wedge cf_{x:=2} \Rightarrow P) \wedge (P \wedge cf_{x:=x+1} \Rightarrow P)\}.$$

Note  that  in  contrast  to  the  previous  example  no  assumptions  need  to  be  placed 
on  the  initial  state. 

Allowing  the  introduction  of  local  variables 

The relation suggested above needs one final adjustment. Currently, it does not
support the introduction of local variables. $C$ always has to be able to match
the transitions of $C'$ exactly, regardless of the local variables that the refinement
step introduced. To allow $C$ to match the transitions of $C'$ modulo a set of
variables $V$ we subscript the refinement relation as follows

$$[P,\Gamma]\ C \succ_V C'\ [Q,\Delta].$$

Intuitively, this refinement expresses that $C$ can match the transitions of $C'$
modulo the variables in $V$ under the assumptions $P$ and $\Gamma$ and the guarantees
$Q$ and $\Delta$. Suppose, for example, that we want to split the assignment

$C \equiv x := 2 \cdot x + y$

into a sequence of simpler ones

$C' \equiv t := 2 \cdot x ; x := t + y.$

$C'$ introduces the auxiliary variable $t$. Obviously, not every transition of $C'$ can
be matched by $x := 2 \cdot x + y$. However, every transition that does not affect the
newly introduced variable $t$ still can be matched. In other words, $C$ can match
every transition of $C'$ modulo the changes to $t$. Formally,

$$[x = 1 \wedge y = 1 \wedge t = 0, \Gamma]$$
$$x := 2 \cdot x + y \ \succ_{\{t\}}\ t := 2 \cdot x ; x := t + y$$
$$[x = 3 \wedge y = 1, \Delta]$$


where

$$\Gamma = \{x = 1, y = 1, t = 2, x = 3\}$$

and $\Delta$ is such that it contains all predicates preserved by $C$ and $C'$. The
introduction of local variables has the following effect on our tentative definition
of refinement.

•  Since the values of the local variables may influence the future behaviour
of the program, we want to be able to mention local variables in the
postcondition $Q$. However, the refined program and the refining program
may assign different final values to the local variables, which in general
makes it impossible to find non-trivial postconditions for the local variables
that are satisfied by both programs. For instance, $C$ terminates with
$t = 0$ whereas $C'$ establishes $t = 2$. We solve this problem by interpreting
the postcondition asymmetrically. Only the refining program $C'$ will be
required to establish $Q$.

•  Global variables can depend on local variables. Different values of a local
variable in $C$ and $C'$ can cause a global variable to take on different values
in $C$ and $C'$. Consider, for instance, the following invalid refinement
formula

$$[x = 0, \{x = 0, x = 1\}]$$
$$\mathbf{skip} ; y := x \ \succ_{\{x\}}\ x := 1 ; y := x$$
$$[y = 1, \mathit{Preds}(\emptyset)].$$

The first assignment $x := 1$ is matched by skip modulo $x$. However, the
second assignment $y := x$ in a state with $x = 1$ cannot be matched by $y := x$
in a state with $x = 0$. The different treatment of the local variable $x$
also causes the global variable $y$ to take on two different values in the two
programs. To remedy this situation, we require local variables $V$ not to
occur free in the refined program $C$, that is, $fv(C) \cap V = \emptyset$. Since the local
variables can always be consistently renamed, this requirement is a merely
syntactic restriction that does not limit the expressivity of the refinement
relation.
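
The invalid formula in the second item can be checked by direct execution. The Python sketch below is a hedged illustration only: it runs both programs sequentially without any environment interference (an assumption), the initial value of y is arbitrary, and the helper names `run` and `agree_modulo` are hypothetical. It shows that the first steps agree modulo x while the final states do not, because the global variable y differs.

```python
def run(steps, state):
    """Execute a list of atomic steps in sequence, without interference."""
    s = dict(state)
    for step in steps:
        s = step(s)
    return s

def agree_modulo(s1, s2, hidden):
    """Two states agree modulo `hidden` if they coincide on all other variables."""
    return all(s1[v] == s2[v] for v in s1 if v not in hidden)

skip = lambda s: dict(s)
y_gets_x = lambda s: {**s, "y": s["x"]}
x_gets_1 = lambda s: {**s, "x": 1}

start = {"x": 0, "y": 5}                      # precondition x = 0
refined = run([skip, y_gets_x], start)        # skip ; y := x
refining = run([x_gets_1, y_gets_x], start)   # x := 1 ; y := x

first_step_ok = agree_modulo(skip(start), x_gets_1(start), {"x"})
final_ok = agree_modulo(refined, refining, {"x"})
```

Here `first_step_ok` holds but `final_ok` does not, which is the situation the well-formedness requirement rules out.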

We are now ready to give the formal definition of our refinement relation.

Definition 5.1 (Refinement)

Let $P$, $Q$ be predicates and $\Gamma$, $\Delta$ be sets of predicates, that is, $P, Q \in \mathit{Preds}(\mathit{Var})$
and $\Gamma, \Delta \subseteq \mathit{Preds}(\mathit{Var})$. Also, let $V$ be a set of variables. We say that $C'$ refines
$C$ modulo $V$ under assumptions $P$ and $\Gamma$ and guarantees $Q$ and $\Delta$,

$$[P,\Gamma]\ C \succ_V C'\ [Q,\Delta]$$

for short, iff we have that

1.  $C$ is well-formed with respect to $V$, that is, no variable in $V$ occurs free
in $C$, $fv(C) \cap V = \emptyset$, and

2.  $C$ guarantees $\Delta$ under assumptions $P$ and $\Gamma$, that is,

$$[P,\Gamma]\ C\ [\mathit{tt},\Delta]$$

and

3.  $C'$ guarantees $Q$ and $\Delta$ under assumptions $P$ and $\Gamma$, that is,

$$[P,\Gamma]\ C'\ [Q,\Delta],$$

and

4.  C'  approximates  C  in  context  E  =  {P};[[]  \\prd^T]  modulo  V.  Formally, 

$$C \geq_E C'\ (\mathrm{mod}\ V).$$

We will call $[P,\Gamma]\ C \succ_V C'\ [Q,\Delta]$ a refinement formula or simply a refinement.

□ 

Informally, refinement expresses that assuming

•  an initial state that satisfies $P$, and

•  a parallel context that preserves the predicates in $\Gamma$,
then

•  every transition of $C$ and $C'$ will preserve the predicates in $\Delta$, and

•  every transition of $C'$ can be matched by $C$ modulo the changes to
variables in $V$, and

•  if $C'$ and the parallel context terminate, they will do so in a state satisfying $Q$.

Due to the well-formedness condition, the variables in $V$ are only used by the
refining program $C'$. They are used by $C'$, but still to be declared.

Example 5.1 (Refinement formulas)

The following are valid refinement formulas.

1.  We return to the above example. We have

$$[x = 1 \wedge y = 1 \wedge t = 0, \{x = 1, y = 1, t = 2, x = 3\}]$$
$$x := 2 \cdot x + y \ \succ_{\{t\}}\ t := 2 \cdot x ; x := t + y$$
$$[x = 3 \wedge y = 1 \wedge t = 2, \mathit{Preds}(\mathit{Var} \setminus \{t, x\})].$$

Note that the refinement relation inherits the difficulty of capturing the set
of all preserved predicates from the assumption-commitment framework.
Both programs preserve more than just the predicates in which neither $t$
nor $x$ occur free. For instance, the predicates $t$ even and $x + y = 0$ are
also preserved. In general, only those predicates will be mentioned whose
preservation is essential.
2.  Let

$C_1 \equiv \{x > 0 \wedge y > 0\};$
$\qquad mul{:}[\mathit{tt}, \mathit{tt}]^* ; \{mul = y \cdot x\}$

$C_2 \equiv \{x > 0 \wedge y > 0\};$
$\qquad$ new $cnt = y$ in
$\qquad$ while $cnt > 0$ do $mul := mul + x ; cnt := cnt - 1$ od.

Under the assumption that the environment does not change either $x$ or
$y$, and preserves the invariant $I \equiv mul = (y - cnt) \cdot x$, program $C_2$ is a
refinement of $C_1$.

$$[\mathit{tt}, \{I\}]$$
$$C_1 \succ_{\emptyset} C_2$$
$$[mul = y \cdot x, \mathit{Preds}(\mathit{Var} \setminus \{mul\})]$$

Locality of $cnt$ prevents environment interference, and ensures termination
of the computation. Note that the environment is allowed to update $x$ and
$y$ as long as the invariant $I$ is preserved. Moreover, $C_1$ and $C_2$ preserve
all predicates over the variables $\mathit{Var} \setminus \{mul\}$.  □
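
The role of the invariant I in the second formula can be sketched by running the loop of C2 in isolation. The following Python fragment is an illustration under explicit assumptions that are not taken from the text: there is no environment interference, mul starts at 0 so that I holds initially, and the function name `c2` is hypothetical. The invariant is asserted only after each complete iteration; between the two assignments of the loop body it is temporarily violated.

```python
def c2(x, y, mul=0):
    """new cnt = y in while cnt > 0 do mul := mul + x ; cnt := cnt - 1 od"""
    cnt = y
    # I: mul = (y - cnt) * x holds initially because mul is assumed to be 0.
    assert mul == (y - cnt) * x
    while cnt > 0:
        mul = mul + x
        cnt = cnt - 1
        # I is re-established after both assignments of the body.
        assert mul == (y - cnt) * x
    # On exit cnt = 0, so I yields the postcondition mul = y * x.
    return mul
```

For example, `c2(3, 4)` performs four iterations and returns the product of its arguments.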


5.3  Properties  of  refinement 


This  section  collects  a  number  of  useful  properties  concerning  the  refinement 
relation.  Since  closed  predicates  are  always  preserved,  they  can  always  be  added 
and  removed  from  a  refinement. 


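Lemma 5.1 (Refinement and equivalent and closed predicates)
We have

$$[P,\Gamma]\ C \succ_V C'\ [Q,\Delta]$$

iff

$$[P, \overline{\Gamma} \cup \mathit{Preds}(\emptyset)]\ C \succ_V C'\ [Q, \overline{\Delta} \cup \mathit{Preds}(\emptyset)]$$

where

$$\overline{\Gamma} = \{P' \mid P \in \Gamma \wedge P \Leftrightarrow P'\}$$
$$\overline{\Delta} = \{P' \mid P \in \Delta \wedge P \Leftrightarrow P'\}.$$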


Proof:  Directly  from  Lemma  3.1  and  the  definitions. 


Due  to  the  above  lemma,  equivalent  and  closed  predicates  will  not  be  explicitly 
mentioned  in  the  set  of  assumptions  and  guarantees  of  a  refinement. 

The next lemma expresses that refinement is reflexive in the case of trivial
commitments and demonstrates how our refinement notion subsumes
assumption-commitment formulas.

Lemma 5.2 (Reflexivity of refinements)

1.  We have

$$[P,\Gamma]\ C \succ C\ [Q,\Delta]$$

iff

$$[P,\Gamma]\ C\ [Q,\Delta].$$

2.  Also,

$$[P,\Gamma]\ C \succ C\ [\mathit{tt}, \mathit{Preds}(\emptyset)].$$

Proof: Follows directly from the definitions.  ■


Refinement between a sequence of programs is transitive if the sets of free
variables of the programs do not decrease. If the sets of free variables are
allowed to decrease, the transitive refinement may not be well-formed.

Lemma 5.3 (Transitivity of refinements)

Let $C_1$, $C_2$, and $C_3$ be programs such that every variable free in $C_1$ is free in
$C_2$, that is, $fv(C_1) \subseteq fv(C_2)$. Then,

$$[P,\Gamma_1]\ C_1 \succ_{V_1} C_2\ [Q_1,\Delta_1]$$

and

$$[P,\Gamma_2]\ C_2 \succ_{V_2} C_3\ [Q_2,\Delta_2]$$

implies

$$[P, \Gamma_1 \cup \Gamma_2]\ C_1 \succ_{V_1 \cup V_2} C_3\ [Q_2, \Delta_1 \cap \Delta_2].$$

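Proof: We have to show that

$$[P, \Gamma_1 \cup \Gamma_2]\ C_1\ [\mathit{tt}, \Delta_1 \cap \Delta_2]$$

and

$$[P, \Gamma_1 \cup \Gamma_2]\ C_3\ [Q_2, \Delta_1 \cap \Delta_2]$$

and

$$C_1 \geq_E C_3\ (\mathrm{mod}\ V_1 \cup V_2)$$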

where  E  ~  {P} ;  [  []  ||  pre^  (Fi  U  r2)] .  The  first  two  formulas  follow  directly  from 
the  assumptions.  Let  Ei  and  E2  be  {P}  ;  [[]  ||  pre^Ti]  and  {P} ;  [[]  ||  pr€'^T2] 
respectively. By assumption, $C_1 \geq_{E_1} C_2\ (\mathrm{mod}\ V_1)$ and $C_2 \geq_{E_2} C_3\ (\mathrm{mod}\ V_2)$.
Moreover, $E \sqsubseteq E_1$ and $E \sqsubseteq E_2$ by Lemma 4.4, that is, both $E_1$ and $E_2$ are
at least as discriminating as $E$. Thus,

$$C_1 \geq_E C_2\ (\mathrm{mod}\ V_1 \cup V_2)$$

and

$$C_2 \geq_E C_3\ (\mathrm{mod}\ V_1 \cup V_2)$$

by definition of context approximation and Corollary 4.2. This implies

$$C_1 \geq_E C_3\ (\mathrm{mod}\ V_1 \cup V_2)$$

since context-sensitive approximation is transitive. To see that the transitive
refinement is well-formed, we need to show that $fv(C_1) \cap (V_1 \cup V_2) = \emptyset$. We have
$fv(C_1) \cap V_1 = \emptyset$ and $fv(C_2) \cap V_2 = \emptyset$ by assumption. Since the variables free in
$C_1$ are also free in $C_2$, that is, $fv(C_1) \subseteq fv(C_2)$, we get $fv(C_1) \cap (V_1 \cup V_2) = \emptyset$
as desired.  ■

The  next  lemma  allows  weakening  of  refinements. 

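Lemma 5.4 (Weakening refinements)

Suppose

$$\mathcal{R} \equiv [P,\Gamma]\ C_1 \succ_V C_2\ [Q,\Delta]$$

is valid.

1.  Strengthening the assumptions and weakening the guarantees weakens
$\mathcal{R}$, that is, if $P' \Rightarrow P$, $\Gamma \subseteq \Gamma'$, $Q \Rightarrow Q'$, and $\Delta' \subseteq \Delta$, then

$$[P',\Gamma']\ C_1 \succ_V C_2\ [Q',\Delta']$$

is valid.

2.  Enlarging the set of local variables weakens $\mathcal{R}$, that is, if $V \subseteq V'$, then

$$[P,\Gamma]\ C_1 \succ_{V'} C_2\ [Q,\Delta]$$

is valid.

3.  Restricting the behaviour of $C_2$ weakens $\mathcal{R}$, that is, if $C_2' \subseteq_{\mathcal{T}^\dagger} C_2$, then

$$[P,\Gamma]\ C_1 \succ_V C_2'\ [Q,\Delta]$$

is valid.

4.  Enlarging the behaviour of $C_1$ while maintaining refinement is a little
harder. The added behaviour also has to meet the guarantees under the
assumptions. If $C_1 \subseteq_{\mathcal{T}^\dagger} C_1'$ and $fv(C_1') \subseteq fv(C_1)$ and

$$[P,\Gamma]\ C_1'\ [\mathit{tt},\Delta],$$

then

$$[P,\Gamma]\ C_1' \succ_V C_2\ [Q,\Delta]$$

is valid.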


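Proof: Suppose $[P,\Gamma]\ C_1 \succ_V C_2\ [Q,\Delta]$.

1.  Assume $P' \Rightarrow P$, $\Gamma \subseteq \Gamma'$, $Q \Rightarrow Q'$, and $\Delta' \subseteq \Delta$. We have to show

$$[P',\Gamma']\ C_2\ [Q',\Delta']$$

and

$$[P',\Gamma']\ C_1\ [\mathit{tt},\Delta'].$$

We prove the first formula. The second can be shown similarly. Let
$\alpha \in \mathcal{T}^\dagger[\![C_2]\!]$ and let $\alpha \models assump(P',\Gamma')$. Since $P' \Rightarrow P$ and $\Gamma \subseteq \Gamma'$,
also $\alpha \models assump(P,\Gamma)$. The premise then implies $\alpha \models guar(Q,\Delta)$. Since
$Q \Rightarrow Q'$ and $\Delta' \subseteq \Delta$, also $\alpha \models guar(Q',\Delta')$ by Lemma 3.2. Consequently,
$[P',\Gamma']\ \alpha\ [Q',\Delta']$ and Lemma 3.3 implies the result.

We also have to show $C_1 \geq_{E'} C_2\ (\mathrm{mod}\ V)$ where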

E'  =  {P^};[n  Wpre^n 

Since $P' \Rightarrow P$ and $\Gamma \subseteq \Gamma'$, Lemma 4.4 implies that $E$ is at least as
discriminating as $E'$, that is, $E' \sqsubseteq E$. Thus, $C_1 \geq_{E'} C_2\ (\mathrm{mod}\ V)$ as
required.

2.  Assume $V \subseteq V'$. We only need to show $C_1 \geq_E C_2\ (\mathrm{mod}\ V')$ which follows
directly from the assumption that $V \subseteq V'$, using Corollary 4.2.

3.  Follows directly using weakening of context-sensitive approximation
(Corollary 4.2) and using weakening of the assumption-commitment formula
(Lemma 3.2).

4.  Follows using weakening of context-sensitive approximation (Corollary 4.2)
and the assumptions.  ■

Note that the above lemma implies that a program $C$ in a refinement formula
can always be replaced by an equivalent program $C'$, that is, a program $C'$ for
which $C =_{\mathcal{T}^\dagger} C'$, without invalidating the refinement.

Refinement is also maintained if the refined program is replaced by another
that preserves the guaranteed predicates in all contexts, as shown by the
corollary below.

Corollary 5.1 (Weakening refinements)

If  C[  Dj-t  Cl,  C[  Cj-t  pre^A,  and 

$$[P,\Gamma]\ C_1 \succ_V C_2\ [Q,\Delta],$$


then 

$$[P,\Gamma]\ C_1' \succ_V C_2\ [Q,\Delta].$$

Proof:  Using  Lemma  3.4  and  Lemma  5.4.  ■ 

The next lemma addresses the relationship between refinement and two other
notions of approximation: trace inclusion and execution inclusion. More
precisely, it expresses both trace inclusion and execution inclusion as special cases
of refinement.

Consider the two sets of predicates $\mathit{Preds}(\mathit{Var})$ and $\mathit{Preds}(\emptyset)$. $\mathit{Preds}(\emptyset)$
contains only the constant predicates $\mathit{tt}$ and $\mathit{ff}$ (and their equivalents). Since $\mathit{tt}$
and $\mathit{ff}$ are always preserved by any program, $\mathit{Preds}(\emptyset)$ places no restrictions and
thus allows the environment to change the state arbitrarily. Lemma 5.5.1 and
Lemma 5.5.2 below show that refinement can capture trace inclusion by placing
only the trivial assumptions $\mathit{Preds}(\emptyset)$ on the environment, that is, refinement
with respect to an environment that preserves no non-trivial predicates
coincides with trace inclusion. $\mathit{Preds}(\mathit{Var})$, on the other hand, contains all predicates
over $\mathit{Var}$. Thus, an environment that preserves all predicates in $\mathit{Preds}(\mathit{Var})$
cannot change any state in any way. If we place the maximal amount of assumptions
$\mathit{Preds}(\mathit{Var})$ on the environment, we obtain execution inclusion. Lemma 5.5.3
and 5.5.4 show that refinement with respect to an environment that preserves
all predicates implies execution inclusion and vice versa. All four lemmas follow
directly from the definitions.

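Lemma 5.5 (Trace and execution inclusion via refinement)

1.  If

$$[\mathit{tt}, \mathit{Preds}(\emptyset)]\ C \succ C'\ [Q,\Delta]$$

for some $Q$ and $\Delta$, then $C \supseteq_{\mathcal{T}^\dagger} C'$.

2.  If $C \supseteq_{\mathcal{T}^\dagger} C'$, then

$$[\mathit{tt}, \mathit{Preds}(\emptyset)]\ C \succ C'\ [\mathit{tt}, \mathit{Preds}(\emptyset)].$$

3.  If

$$[P, \mathit{Preds}(\mathit{Var})]\ C \succ C'\ [Q,\Delta]$$

for some $\Delta$, then

$$\{P\};C \supseteq_{\mathcal{E}^\dagger} \{P\};C' \quad\text{and}\quad \{P\}\ C'\ \{Q\}$$

where $\{P\}\ C'\ \{Q\}$ is the standard Hoare-triple notation for partial
correctness.

4.  If $\{P\};C \supseteq_{\mathcal{E}^\dagger} \{P\};C'$ then

$$[P, \mathit{Preds}(\mathit{Var})]\ C \succ C'\ [\mathit{tt}, \mathit{Preds}(\emptyset)].$$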

□ 

This lemma shows that trace and execution inclusion occupy the two extreme
ends of the refinement spectrum. For illustration, consider Figure 5.1. The
more restrictions are put on the environment, the more refinement behaves
like execution inclusion. The fewer restrictions are put on the environment, the
more refinement behaves like trace inclusion. $C_1 \subseteq_{\mathcal{T}^\dagger} C_2$ compares $C_1$ and $C_2$ as
open systems subject to unlimited environment interference. On the other hand,
$C_1 \subseteq_{\mathcal{E}^\dagger} C_2$ compares $C_1$ and $C_2$ as closed systems subject to no environment
interference. The benefit of compositionality has to be paid for with unlimited
interference. The exclusion of interference, however, yields a non-compositional
semantics.


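[Figure 5.1: Trace and execution inclusion as special cases of refinement. The diagram shows trace inclusion between $C_1$ and $C_2$ at one end (more compositionality), execution inclusion between $C_1$ and $C_2$ at the other end (more environment assumptions), and the refinement $[P,\Gamma]\ C_1 \succ C_2\ [Q,\Delta]$ in between.]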


5.4  The  refinement  calculus 

Having given the refinement relation, we now present a collection of rules that
govern it. The treatment is reminiscent of Stirling's proof system in [Sti88].
Compositionality is achieved through assumption-commitment reasoning. The
major difference is, however, that assumption-commitment reasoning is
harnessed for a notion of program refinement. We will distinguish four kinds of
rules. Assumption-commitment rules allow the derivation of an
assumption-commitment formula. Basic rules deal with the basic constructs in the language.
Derived rules deal with the more standard programming language constructs
like if and while. Their soundness follows from the soundness of the basic rules
using the embedding in Section 2.3. The basic and the derived rules are
syntax-directed in the sense that the premises of each rule involve assertions about
the proper subprograms of the program mentioned in the rule's conclusion.
Introduction rules allow the introduction of a new construct across a refinement
step. We will now present each class of rules. The well-formedness condition on
a refinement, $fv(C) \cap V = \emptyset$, will be abbreviated by $wf(C, V)$.

5.4.1  Assumption-commitment rules

We  present  only  one  rule  to  derive  assumption-commitment  formulas.  This  rule 
applies  to  atomic  statements  only  and  is  based  directly  on  Lemma  3.4. 

Rule ASSCOM

Let $A$ be an atomic statement. If

•  $\{P, Q\} \subseteq \Gamma$, and

•  $(P \wedge cf_A) \Rightarrow Q$, and

•  $\Delta \subseteq \{Q \mid (P \wedge Q \wedge cf_A \Rightarrow Q)\}$,

then

$$[P,\Gamma]\ A\ [Q,\Delta].$$


5.4.2  Basic  rules 

Each of the syntactic constructs in our language has a corresponding
syntax-directed refinement rule. This rule is compositional in the sense that the
refinement of the overall program is obtained by refining each of the immediate
constituents. The basic rules are summarized in Figures 5.4 and 5.5.

Below we will state each rule and then briefly explain the intuition behind
that rule. The full soundness proofs can be found in Section A.2.1.

Rule ATOM

If $A_1$ and $A_2$ are atomic statements and

1.  $[P,\Gamma]\ A_1\ [\mathit{tt},\Delta]$, and

2.  $[P,\Gamma]\ A_2\ [Q,\Delta]$, and

3.  $(\exists x_1 \ldots x_n.\ P \wedge cf_{A_2}) \Rightarrow (\exists x_1 \ldots x_n.\ P \wedge cf_{A_1})$, and

4.  $wf(A_1, V)$,

then

$$[P,\Gamma]\ A_1 \succ_V A_2\ [Q,\Delta]$$

where $V = \{x_1, \ldots, x_n\}$.

The first premise ensures that $A_1$ guarantees $\Delta$ under the assumptions $P$
and $\Gamma$. The second premise ensures that $A_2$ guarantees $Q$ and $\Delta$ under the
same assumptions. Note that these assumption-commitment formulas can be
established using Lemma 3.4.2. The last two premises ensure that for every
transition $(s, s_2')$ of atomic statement $A_2$ there is a transition $(s, s_1')$ of $A_1$ such
that $s_2'$ coincides with $s_1'$ modulo the variables in $V$.
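
The matching condition expressed by the last two premises can be sketched as a check over a finite state space. The Python fragment below is an illustration only, under assumptions not made in the text: atomic statements are deterministic and modelled as functions on states, the state space is the small grid `STATES`, and the name `matches_modulo` is hypothetical.

```python
from itertools import product

VARS = ("x", "t")
# All states over VARS with values in 0..3 (an arbitrary finite sample).
STATES = [dict(zip(VARS, vals)) for vals in product(range(4), repeat=len(VARS))]

def matches_modulo(a1, a2, pre, hidden):
    """True if, from every state satisfying `pre`, the step of a2 is matched
    by the step of a1 on all variables outside `hidden`."""
    for s in STATES:
        if not pre(s):
            continue
        s1, s2 = a1(s), a2(s)
        if any(s1[v] != s2[v] for v in VARS if v not in hidden):
            return False
    return True

skip = lambda s: dict(s)
t_gets_x = lambda s: {**s, "t": s["x"]}   # t := x

always = lambda s: True
ok_modulo_t = matches_modulo(skip, t_gets_x, always, {"t"})
ok_exact = matches_modulo(skip, t_gets_x, always, set())
```

With V = {t} the step t := x is matched by skip, while with an empty V it is not, since the two successor states differ on t whenever t and x differ initially.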

Rule SEQ

$$\frac{[P,\Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q_1,\Delta_1] \qquad [Q_1,\Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q,\Delta_2]}{[P, \Gamma_1 \cup \Gamma_2]\ C_1 ; C_2 \succ_{V_1 \cup V_2} C_1' ; C_2'\ [Q, \Delta_1 \cap \Delta_2]}$$

where $wf(C_1, V_2)$ and $wf(C_2, V_1)$.

First, each of the subprograms is refined separately. Then, the refinements
are joined along an intermediate state satisfying $Q_1$. A predicate needs to be
preserved in the overall refinement iff one of the subprograms requires it. On
the other hand, a predicate is preserved by $C_1 ; C_2$ iff it is preserved by $C_1$ and
by $C_2$. The side condition ensures syntactic well-formedness.
Rule OR

$$\frac{[P,\Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q,\Delta_1] \qquad [P,\Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q,\Delta_2]}{[P, \Gamma_1 \cup \Gamma_2]\ C_1 \vee C_2 \succ_{V_1 \cup V_2} C_1' \vee C_2'\ [Q, \Delta_1 \cap \Delta_2]}$$

where $wf(C_1, V_2)$ and $wf(C_2, V_1)$.

The intuition behind this rule is similar to that of the SEQ rule. Note,
however, that both refinements of the components use the same precondition $P$
and postcondition $Q$.

Rule STAR

$$\frac{[I,\Gamma]\ C \succ_V C'\ [I,\Delta]}{[I,\Gamma]\ C^* \succ_V (C')^*\ [I,\Delta]}$$

The refinement of $C$ into $C'$ with $I$ as the invariant gives rise to the refinement
of $C^*$ into $(C')^*$.

Rule OMEGA

$$\frac{[I,\Gamma]\ C \succ_V C'\ [I,\Delta]}{[I,\Gamma]\ C^\omega \succ_V (C')^\omega\ [Q,\Delta]}$$

As in the case of STAR, the refinement of $C$ into $C'$ with invariant $I$ gives rise
to the refinement of $C^\omega$ into $(C')^\omega$. Due to the partial correctness semantics
of the postcondition, the non-terminating program $(C')^\omega$ vacuously satisfies any
postcondition $Q$.

Rule PAR

$$\frac{[P_1,\Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q_1,\Delta_1] \qquad [P_2,\Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q_2,\Delta_2]}{[P_1 \wedge P_2, \Gamma_1 \cup \Gamma_2]\ C_1 \parallel C_2 \succ_{V_1 \cup V_2} C_1' \parallel C_2'\ [Q_1 \wedge Q_2, \Delta_1 \cap \Delta_2]}$$

where $\Gamma_1 \subseteq \Delta_2$ and $\Gamma_2 \subseteq \Delta_1$ and $wf(C_1, V_2)$ and $wf(C_2, V_1)$.

This is where keeping track of the assumptions $\Gamma$ and the commitments $\Delta$
pays off and allows the formulation of a compositional rule. Guarantees and
assumptions have to mutually imply each other. The requirements $\Gamma_1$ of $C_1$
have to be contained in the guarantees of $C_2$ and vice versa. This means for
every parallel component $C_i$ that the parallel environment of $C_i$ cannot prevent
$C_i$ from meeting its specification. This rule is similar in spirit to corresponding
rules using assumption-commitment reasoning (e.g., [Jon81, Sti88]).
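
The bookkeeping performed by PAR can be sketched as a small function that combines two refinement formulas after checking the side conditions. This Python fragment is a purely syntactic illustration under strong simplifying assumptions that are not part of the calculus: predicates are represented as strings, $\Gamma$ and $\Delta$ as finite sets compared by membership, and the names `Refinement` and `par` are hypothetical. It does not establish soundness of any premise.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Refinement:
    pre: frozenset         # conjuncts of the precondition P
    gamma: frozenset       # assumptions on the environment
    refined: str           # C
    refining: str          # C'
    post: frozenset        # conjuncts of the postcondition Q
    delta: frozenset       # commitments
    local: frozenset       # V
    fv_refined: frozenset  # free variables of C

def par(r1, r2):
    """Combine two refinements as in rule PAR, checking its side conditions."""
    # Each side's assumptions must be among the other side's commitments.
    if not (r1.gamma <= r2.delta and r2.gamma <= r1.delta):
        raise ValueError("assumptions not covered by the other commitments")
    # wf(C1, V2) and wf(C2, V1): no local variable occurs free in the other C.
    if r1.fv_refined & r2.local or r2.fv_refined & r1.local:
        raise ValueError("well-formedness violated")
    return Refinement(
        pre=r1.pre | r2.pre,
        gamma=r1.gamma | r2.gamma,
        refined=f"({r1.refined} || {r2.refined})",
        refining=f"({r1.refining} || {r2.refining})",
        post=r1.post | r2.post,
        delta=r1.delta & r2.delta,
        local=r1.local | r2.local,
        fv_refined=r1.fv_refined | r2.fv_refined,
    )

fs = frozenset
r1 = Refinement(fs({"x=0"}), fs({"x>=0"}), "A", "A'", fs({"x>0"}),
                fs({"x>=0", "y>=0"}), fs(), fs({"x"}))
r2 = Refinement(fs({"y=0"}), fs({"y>=0"}), "B", "B'", fs({"y>0"}),
                fs({"x>=0", "y>=0", "z=1"}), fs({"t"}), fs({"y"}))
combined = par(r1, r2)
```

The combined formula takes the union of the assumptions and the intersection of the commitments, mirroring the conclusion of the rule.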
Rule NEW

$$\frac{[P,\Gamma]\ C \succ_V C'\ [Q,\Delta] \qquad x \notin V}{[P[v/x], \Gamma']\ \mathbf{new}\ x = v\ \mathbf{in}\ C \ \succ_V\ \mathbf{new}\ x = v\ \mathbf{in}\ C'\ [\exists x.Q, \Delta']}$$

where

$$\Gamma' = \{P[v'/x] \mid P \in \Gamma \wedge v' \in Dom_x\}$$
$$\Delta' = \{P, P[v'/x] \mid P[v'/x] \in \Delta \wedge v' \in Dom_x\}.$$

This rule refines a declaration by refining its body. This is one of only two
rules that allow for the weakening of the assumptions and the strengthening of
the commitments. Suppose $C$ is refined into $C'$ under assumptions $P$ and $\Gamma$
and commitments $Q$ and $\Delta$. Both assumptions and commitments may mention
$x$. The declaration of $x$ initializes it and also withdraws it from environment
interference: the parallel environment cannot change its value anymore. The
choice of the initial value for $x$ must be consistent with $P$, that is, the initial
state must now satisfy $P[v/x]$. Moreover, an assumption $P \in \Gamma$ involving $x$ can
be weakened to $P[v/x] \in \Gamma'$ for all $v \in Dom_x$. Also, the value of $x$ will not
change during execution of new $x = v$ in $C'$. Thus, all commitments of form
$P[v/x] \in \Delta$ can now be strengthened to $P \in \Delta'$. To use this rule "backwards",
that is, to prove

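$$[P',\Gamma']$$
$$\mathbf{new}\ x = v\ \mathbf{in}\ C \ \succ_V\ \mathbf{new}\ x = v\ \mathbf{in}\ C'$$
$$[Q',\Delta']$$

we must find $P$, $\Gamma$, $Q$, and $\Delta$ such that

$$[P,\Gamma]\ C \succ_V C'\ [Q,\Delta]$$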

and  P'  ^  P[v/x],  F'  =  {P[v'/x]  |  P  €  F.v'  €  Dom®},  Q'  ^  Bx.Q,  and 
A'  =  {P,  P[v'/x]  I  P  e  r,  t;'  €  £»om®}. 

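In contrast to rule NEW-INTRO, $x$ is not allowed to occur in $V$, that is, $C$
has to be able to match every state change to $x$ by $C'$ precisely.

Rule WEAK

$$\frac{[P',\Gamma']\ C_1' \succ_{V'} C_2'\ [Q',\Delta']}{[P,\Gamma]\ C_1 \succ_V C_2\ [Q,\Delta]}$$

where $C_1' =_{\mathcal{T}^\dagger} C_1$, $C_2 \subseteq_{\mathcal{T}^\dagger} C_2'$, $P \Rightarrow P'$, $Q' \Rightarrow Q$, $\Gamma' \subseteq \Gamma$, $\Delta \subseteq \Delta'$, and $V' \subseteq V$.

This rule allows us to strengthen the assumptions and weaken the
commitments. Moreover, the behaviour of the refining program can be restricted
($C_2 \subseteq_{\mathcal{T}^\dagger} C_2'$).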
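Example 5.2 (Basic rules)

We demonstrate the use of some of the basic rules using a small example.
Suppose we want to refine

$$y{:}[\mathit{tt}, y\ \mathit{odd}] \parallel z{:}[\mathit{tt}, z \bmod 4 = 0]$$

into

$$x := 2 ; y := x + 1 \parallel x := 10 ; z := x + x.$$

We start by deducing

$$\mathcal{R}_1 \equiv [\mathit{tt}, \{x\ \mathit{even}\}]$$
$$\mathbf{skip} \ \succ_{\{x\}}\ x := 2 \qquad \text{ASSCOM, ATOM}$$
$$[x\ \mathit{even}, \mathit{Preds}(\mathit{Var} \setminus \{x\})]$$

and

$$\mathcal{R}_2 \equiv [x\ \mathit{even}, \{x\ \mathit{even}, y\ \mathit{odd}\}]$$
$$y{:}[\mathit{tt}, y\ \mathit{odd}] \ \succ\ y := x + 1 \qquad \text{ASSCOM, ATOM}$$
$$[y\ \mathit{odd}, \mathit{Preds}(\mathit{Var} \setminus \{y\})].$$

Composing both refinements sequentially yields

$$\mathcal{R}_3 \equiv [\mathit{tt}, \{x\ \mathit{even}, y\ \mathit{odd}\}]$$
$$\mathbf{skip} ; y{:}[\mathit{tt}, y\ \mathit{odd}] \ \succ_{\{x\}}\ x := 2 ; y := x + 1 \qquad \text{SEQ}(\mathcal{R}_1, \mathcal{R}_2)$$
$$[y\ \mathit{odd}, \mathit{Preds}(\mathit{Var} \setminus \{x, y\})].$$

Using ATOM and SEQ, we can derive similarly

$$\mathcal{R}_4 \equiv [\mathit{tt}, \{x\ \mathit{even}, z \bmod 4 = 0\}]$$
$$\mathbf{skip} ; z{:}[\mathit{tt}, z \bmod 4 = 0] \ \succ_{\{x\}}\ x := 10 ; z := x + x \qquad \text{ATOM, SEQ}$$
$$[z \bmod 4 = 0, \mathit{Preds}(\mathit{Var} \setminus \{x, z\})].$$

Putting the refinements $\mathcal{R}_3$ and $\mathcal{R}_4$ in parallel we obtain

$$\mathcal{R}_5 \equiv [\mathit{tt}, \{x\ \mathit{even}, y\ \mathit{odd}, z \bmod 4 = 0\}]$$
$$[\mathbf{skip} ; y{:}[\mathit{tt}, y\ \mathit{odd}] \parallel \mathbf{skip} ; z{:}[\mathit{tt}, z \bmod 4 = 0]]$$
$$\succ_{\{x\}} \qquad \text{PAR}(\mathcal{R}_3, \mathcal{R}_4)$$
$$[x := 2 ; y := x + 1 \parallel x := 10 ; z := x + x]$$
$$[y\ \mathit{odd} \wedge z \bmod 4 = 0, \mathit{Preds}(\mathit{Var} \setminus \{x, y, z\})].$$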
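Lemma 2.1 says that $E[\mathbf{skip} ; C] =_{\mathcal{T}^\dagger} E[C]$ for all programs $C$ and contexts $E$.
Consequently,

$$[\mathbf{skip} ; y{:}[\mathit{tt}, y\ \mathit{odd}] \parallel \mathbf{skip} ; z{:}[\mathit{tt}, z \bmod 4 = 0]]$$
$$=_{\mathcal{T}^\dagger}\ [y{:}[\mathit{tt}, y\ \mathit{odd}] \parallel z{:}[\mathit{tt}, z \bmod 4 = 0]].$$

Using the above equivalence, refinement $\mathcal{R}_5$ can be weakened to

$$\mathcal{R}_6 \equiv [\mathit{tt}, \{x\ \mathit{even}, y\ \mathit{odd}, z \bmod 4 = 0\}]$$
$$[y{:}[\mathit{tt}, y\ \mathit{odd}] \parallel z{:}[\mathit{tt}, z \bmod 4 = 0]]$$
$$\succ_{\{x\}}$$
$$[x := 2 ; y := x + 1 \parallel x := 10 ; z := x + x]$$
$$[y\ \mathit{odd} \wedge z \bmod 4 = 0, \mathit{Preds}(\mathit{Var} \setminus \{x, y, z\})]$$

with Lemma 5.4.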
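
The postcondition of the refining program in this derivation can be cross-checked by brute force. The Python sketch below is an illustration only, under assumptions not made in the text: the two components are the only processes (no further environment steps), the initial state is arbitrary, and the helper name `interleavings` is hypothetical. It enumerates every interleaving of the four atomic assignments and records the final states.

```python
def interleavings(p, q):
    """Yield all interleavings of the step lists p and q, preserving their order."""
    if not p:
        yield list(q)
        return
    if not q:
        yield list(p)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest

# x := 2 ; y := x + 1
left = [lambda s: {**s, "x": 2}, lambda s: {**s, "y": s["x"] + 1}]
# x := 10 ; z := x + x
right = [lambda s: {**s, "x": 10}, lambda s: {**s, "z": s["x"] + s["x"]}]

finals = []
for schedule in interleavings(left, right):
    s = {"x": 0, "y": 0, "z": 0}
    for step in schedule:
        s = step(s)
    finals.append(s)

# Both components keep x even, so y ends up odd and z a multiple of 4.
all_ok = all(s["y"] % 2 == 1 and s["z"] % 4 == 0 for s in finals)
```

The final values of y and z differ between schedules, but every schedule satisfies y odd and z mod 4 = 0, which is all the specification demands.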

For an example involving NEW consider the refinement

$$[x > 5 \wedge y = 7, \{x > 5, y = 7, x \geq 12\}]$$
$$x{:}[\mathit{tt}, x > 10] \ \succ\ x := x + y$$
$$[x \geq 12 \wedge y = 7, \mathit{Preds}(\mathit{Var} \setminus \{x\})].$$

In an initial state satisfying $x > 5 \wedge y = 7$ and in an environment preserving
$x > 5$, $y = 7$, and $x \geq 12$, the assignment $x := x + y$ will set $x$ to a number larger
than 10 and $x$ will be greater or equal to 12 upon termination. An application
of NEW yields

$$[6 > 5 \wedge y = 7, \{v > 5, v' \geq 12 \mid v, v' \in \mathbb{N}\} \cup \{y = 7\}]$$
$$\mathbf{new}\ x = 6\ \mathbf{in}\ x{:}[\mathit{tt}, x > 10] \ \succ\ \mathbf{new}\ x = 6\ \mathbf{in}\ x := x + y$$
$$[\exists x.\ x \geq 12 \wedge y = 7, \mathit{Preds}(\mathit{Var})]$$

which is equivalent to

$$[y = 7, \{y = 7\}]$$
$$\mathbf{new}\ x = 6\ \mathbf{in}\ x{:}[\mathit{tt}, x > 10] \ \succ\ \mathbf{new}\ x = 6\ \mathbf{in}\ x := x + y$$
$$[y = 7, \mathit{Preds}(\mathit{Var})]$$

by Lemma 5.1. Note that the initialization $x = 6$ is consistent with the
precondition $x > 5 \wedge y = 7$, that is, $6 > 5 \wedge y = 7$ is satisfiable. Declaring $x$ local
shields it from environment interference, that is, the value of $x$ and thus also
the predicates $x > 5$ and $x \geq 12$ will always be preserved by the environment.
Moreover, all changes to $x$ become invisible to the environment, that is, both
the refined and the refining programs will not change the value of $x$ and thus
preserve all predicates involving $x$.  □
5.4.3  Derived rules

The derived rules are summarized in Figures 5.7 and 5.6. We will now state each
rule and briefly explain the intuition behind that rule. The soundness proofs
can be found in Section A.2.2.


Rule COND
If

1.  $[P \wedge B, \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q,\Delta_1]$, and

2.  $[P \wedge \neg B, \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q,\Delta_2]$, and

3.  $P \Rightarrow (B \Leftrightarrow B')$, and

4.  $wf(C_1, V_2)$ and $wf(C_2, V_1)$,
then

$$[P, \Gamma_1 \cup \Gamma_2 \cup \{P, B', \neg B'\}]$$
$$\mathbf{if}\ B\ \mathbf{then}\ C_1\ \mathbf{else}\ C_2 \ \succ_{V_1 \cup V_2}\ \mathbf{if}\ B'\ \mathbf{then}\ C_1'\ \mathbf{else}\ C_2'$$
$$[Q, \Delta_1 \cap \Delta_2].$$

Each of the branches is refined separately. The condition $B$ can be replaced by
$B'$ iff they are equivalent in initial states satisfying $P$. As in rule SEQ we need
to enforce that the local variables of one refinement do not occur freely in the
other refinement. Since the refinements of the two branches require initial states
satisfying $P \wedge B$ and $P \wedge \neg B$ respectively, the preservation of $P$, $B$, and $\neg B$
needs to be ensured. To this end, $P$, $B'$ and $\neg B'$ are added to the assumptions
(note that we could also add $P$, $B$, and $\neg B$ instead).


Rule FOR
If

1.  $i$ is an integer constant, and

2.  $[P[k-1/i], \Gamma_k]\ C[k/i] \succ_V C'[k/i]\ [P[k/i], \Delta_k]$ for all $1 \leq k \leq n$,

then

$$\mathbf{for}\ i = 1\ \mathbf{to}\ n\ \mathbf{do}\ C \ \succ_V\ \mathbf{for}\ i = 1\ \mathbf{to}\ n\ \mathbf{do}\ C'$$

The loop counter $i$ has to be an integer variable that is never assigned to in
both $C$ and $C'$. Each iteration $C[k/i]$ of $C$ is refined by $C'[k/i]$.
Rule WHILE

$$\frac{[I \wedge B, \Gamma]\ C \succ_V C'\ [I,\Delta] \qquad fv(B) \cap V = \emptyset}{[I, \Gamma \cup \{B', \neg B'\}]\ \mathbf{while}\ B\ \mathbf{do}\ C \ \succ_V\ \mathbf{while}\ B'\ \mathbf{do}\ C'\ [I \wedge \neg B', \Delta]}$$

The refinement of the loop body also determines the invariant $I$. Condition $B$
can be replaced by $B'$ iff they are equivalent under $I$. Intuitively, predicates
$I$ and $B'$ need to be added to the assumptions, because the premise requires
initial states satisfying $I \wedge B$. Predicate $\neg B'$ additionally ensures $I \wedge \neg B'$ upon
termination. As in Rule COND, predicates $B$ and $\neg B$ could also have been
added instead.


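Rule PAR-N

If

1.  $[P_i, \Gamma_i]\ C_i \succ_{V_i} C_i'\ [Q_i, \Delta_i]$, and

2.  $\Gamma_i \subseteq \bigcap_{j=1, j \neq i}^{n} \Delta_j$, and

3.  $wf(C_j, V_i)$ for all $1 \leq j \leq n$ with $j \neq i$,
for all $1 \leq i \leq n$, then

$$[\textstyle\bigwedge_{i=1}^{n} P_i, \bigcup_{i=1}^{n} \Gamma_i]\ \ \parallel_{i=1}^{n} C_i \ \succ_{\bigcup_{i=1}^{n} V_i}\ \parallel_{i=1}^{n} C_i'\ \ [\bigwedge_{i=1}^{n} Q_i, \bigcap_{i=1}^{n} \Delta_i].$$

This rule generalizes PAR. It allows the refinement of an arbitrary number $n$ of
parallel processes. As in PAR, SEQ and COND none of the local variables used
in one refinement can occur freely in any other refinement. The soundness of
PAR-N is shown inductively.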


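Rule PAR-V

$$\frac{[P_1,\Gamma_1]\ C_1 \succ C_1'\ [Q_1,\Delta_1] \qquad [P_2,\Gamma_2]\ C_2 \succ C_2'\ [Q_2,\Delta_2]}{[P_1 \wedge P_2, \Gamma_1 \cup \Gamma_2]\ C_1 \parallel C_2 \ \succ\ C_1' \parallel C_2'\ [Q_1 \wedge Q_2, \Delta_1 \cap \Delta_2]}$$

where $\Gamma_1 \subseteq \Delta_2$ and $\Gamma_2 \subseteq \Delta_1$.

This rule differs from PAR only in that it requires an empty set of local
variables. Soundness thus follows directly from that of PAR.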


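Rule PAR-V-N

If

1.  $[P_i, \Gamma_i]\ C_i \succ C_i'\ [Q_i, \Delta_i]$, and

2.  $\Gamma_i \subseteq \bigcap_{j=1, j \neq i}^{n} \Delta_j$

for all $1 \leq i \leq n$, then

$$[\textstyle\bigwedge_{i=1}^{n} P_i, \bigcup_{i=1}^{n} \Gamma_i]\ \ \parallel_{i=1}^{n} C_i \ \succ\ \parallel_{i=1}^{n} C_i'\ \ [\bigwedge_{i=1}^{n} Q_i, \bigcap_{i=1}^{n} \Delta_i].$$

This rule generalizes PAR-V to an arbitrary number of parallel processes. The
soundness is shown inductively.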
5.4.4  Introduction rules

Additionally, we need rules that allow the introduction of constructs. These
rules are given in Figures 5.8 and 5.9. Again, soundness of the rules is proved
in Section A.2.3. Note that these rules require the set of local variables $V$ to be
empty. The rules could easily be extended to the non-empty case by adding an
appropriate constraint as in SEQ, COND, or PAR. For our purposes, however,
the simpler version suffices.

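Rule PAR-INTRO

If

1. $C$ is robust and preserves $\bigcap_{i} \Delta_i$ in all contexts, and

2. $[P, \Gamma_i] \; C \succ C_i \; [Q_i, \Delta_i]$ for all $1 \leq i \leq n$, and

3. $\Gamma_i \subseteq \bigcap_{j=1, j \neq i}^{n} \Delta_j$ for all $1 \leq i \leq n$, and

4. $(\forall 1 \leq i \leq n. Q_i) \Rightarrow Q$,

then

\[
\Bigl[P,\ \bigcup_{i=1}^{n} \Gamma_i\Bigr] \quad C^* ; \{Q\} \ \succ\ \parallel_{i=1}^{n} C_i \quad \Bigl[Q,\ \bigcap_{i=1}^{n} \Delta_i\Bigr].
\]

This rule is very important, because given a robust program $C$, it allows the
introduction of parallelism. A loop $C^* ; \{Q\}$ can be refined into a parallel
composition if the four premises hold.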

• Premise 1: As discussed at the end of Section 2.2.4, the finite loop $C^*$
over a robust program $C$ can be refined into a parallel composition $\parallel_{i=1}^{n} C$.
More precisely, we have

\[ C^* \supseteq\ \parallel_{i=1}^{n} C \]

using Proposition 2.1. If, moreover, $C$ preserves $\bigcap_{i} \Delta_i$ in all contexts,
then we have

\[ [tt, Preds(\emptyset)] \quad C^* \succ\ \parallel_{i=1}^{n} C \quad \Bigl[tt, \bigcap_{i=1}^{n} \Delta_i\Bigr] \]

by Lemma 3.4.

• Premises 2, 3 and 4: If for all $1 \leq i \leq n$, $C$ can be refined into $C_i$ under
assumptions $[P, \Gamma_i]$ and guarantees $[Q_i, \Delta_i]$, and each of the assumptions
$\Gamma_i$ is met by the commitments of the parallel environment $\parallel_{j=1, j \neq i}^{n} C_j$,
and the conjunction of the postconditions of the parallel components
implies the desired postcondition of the entire parallel composition,
then $\parallel_{i=1}^{n} C$ can be refined into $\parallel_{i=1}^{n} C_i$ under assumptions
$[P, \bigcup_{i=1}^{n} \Gamma_i]$ and guarantees $[Q, \bigcap_{i=1}^{n} \Delta_i]$.

The four premises imply the consequence via the following sequence of
refinements.

\[
\begin{array}{ll}
[P, \bigcup_{i=1}^{n} \Gamma_i] & \\
\quad C^* & \\
\succ & \text{Premise 1} \\
\quad \parallel_{i=1}^{n} C & \\
\succ & \text{Premises 2, 3, and 4} \\
\quad \parallel_{i=1}^{n} C_i & \\
{[Q, \bigcap_{i=1}^{n} \Delta_i]}. &
\end{array}
\]

Rule WHILE-INTRO

If

1. $[B \wedge I, \Gamma] \; C \succ C' \; [I, \Delta]$ and

2. $(\neg B \wedge I) \Rightarrow Q$ and

3. there exists an arithmetic expression $m$ over the free variables in $B$ and
$C'$ such that $m \geq 0$, and $m = 0 \Rightarrow \neg B$, and $C'$ always decreases $m$, that
is,

C'  Cq-t  {inv*m  ;  Am  ;  tnv*m)^ 

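then

\[ [I, \Gamma \cup \Gamma_m \cup \{Q\}] \quad (\{B \wedge I\} ; C)^* ; \{Q\} \ \succ\ \textbf{while } B \textbf{ do } C' \quad [Q, \Delta] \]

where

\[ A_m = Var{:}[tt,\ \overleftarrow{m} = 0 \rightarrow m = 0 \mid m < \overleftarrow{m}] \]

\[ \Gamma_m = \{ m \leq n \mid n \in \mathbb{N} \} \]

and $P_{if} \rightarrow P_{then} \mid P_{else}$ abbreviates $(P_{if} \Rightarrow P_{then}) \wedge (\neg P_{if} \Rightarrow P_{else})$. This rule
allows the replacement of a finite iteration by a while loop. A finite loop
$(\{B \wedge I\} ; C)^* ; \{Q\}$ can be refined into a while $B$ do $C'$ loop if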

1. the body $C$ can be refined to $C'$,

2.  the  negation  of  the  loop  condition  and  the  invariant  imply  the  desired 
postcondition, 

3. This condition needs a little explanation. To show termination of the
resulting while loop we recast the well-known total correctness rule for
while loops in trace-theoretic terms and also transfer it to a concurrent
setting. Remember that $Var$ denotes all program variables. Given a
measure $m$, the statement $A_m$ decreases $m$ if it is not zero and leaves it
unchanged if it is zero. Thus,

:,Am  ;  inv*m)'^ 

requires that each iteration decreases $m$. Since $m$ is always non-negative
and the environment cannot increase $m$ due to $\Gamma_m$, $m$ must eventually be
set to 0, which implies $\neg B$ and thus termination of the loop.
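
The role of the measure can be made concrete with a small sequential sketch. It instruments the summation loop of Chapter 6.1 with runtime checks of the three conditions on $m$. The choice $m = n + 1 - k$ and the function name `sum_with_variant` are assumptions made for this illustration; the sketch has no parallel environment, so the assumptions $\Gamma_m$ hold trivially.

```python
from typing import List


def sum_with_variant(A: List[int]) -> int:
    """Sum A[1..n] with a while loop, checking a termination measure m."""
    n = len(A)
    k, t = 1, 0

    def guard() -> bool:          # loop condition B
        return k <= n

    def measure() -> int:         # assumed measure m = n + 1 - k
        return n + 1 - k

    while guard():
        before = measure()
        assert before > 0         # contrapositive of: m = 0 implies not B
        t += A[k - 1]             # thesis arrays are 1-based, Python lists 0-based
        k += 1
        after = measure()
        assert 0 <= after < before  # the body strictly decreases m, and m >= 0
    assert not guard()            # the loop exits with not B
    return t
```

Because $m$ is a non-negative integer that strictly decreases in every iteration, the loop can run at most $n$ times.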

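Rule FOR-INTRO

If

1. $i$ is an integer variable that is never assigned to and

2. $[I[k/i], \Gamma_k] \; C \succ C'[k/i] \; [I[k+1/i], \Delta_k]$ for all $0 \leq k \leq n - 1$ and

3. $I[n/i] \Rightarrow Q$,

then

\[
\Bigl[I[0/i],\ \bigcup_{i=1}^{n} \Gamma_i\Bigr] \quad C^* ; \{Q\} \ \succ\ \textbf{for } i = 1 \textbf{ to } n \textbf{ do } C' \quad \Bigl[Q,\ \bigcap_{i=1}^{n} \Delta_i\Bigr].
\]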

Alternatively, a finite loop $C^* ; \{Q\}$ can be refined into the for loop for $i =
1$ to $n$ do $C'$, if

1. the loop counter $i$ is an integer variable and never assigned to in $C'$,

2. $C$ can be refined to $C'[k/i]$ for each iteration $k$ using loop invariant $I[k/i]$,

3. the desired postcondition $Q$ is implied by $I[n/i]$.

Rule NEW-INTRO

\[
\frac{[P, \Gamma] \; C \succ_{V \cup \{x\}} C' \; [Q, \Delta]}
     {[P[v/x], \Gamma'] \; C \succ_V \textbf{new } x = v \textbf{ in } C' \; [\exists x. Q, \Delta']}
\]

where

\[ \Gamma' = \{ P[v'/x] \mid P \in \Gamma \wedge v' \in Dom_x \} \]

\[ \Delta' = \{ P, P[v'/x] \mid P[v'/x] \in \Delta \wedge v' \in Dom_x \}. \]

If $C$ can be refined into $C'$ by allowing changes to a variable $x$ to be ignored
and under some precondition $P$, then $C$ can also be refined into new $x =
v$ in $C'$ in initial states satisfying $P[v/x]$ and parallel environments preserving
all predicates in $\Gamma'$. Like rule NEW, this rule weakens the assumptions and
strengthens the commitments. NEW-INTRO is the only rule by means of which
the set of local variables in the subscript can be reduced. It is a straightforward
consequence of NEW, Lemma 2.1, and Lemma 5.4.

Rule AWAIT-INTRO

If

1. $[P_1, \Gamma] \; [V{:}[B \wedge P_2, Q_2] \parallel D] \; [Q_1, \Delta]$ and

2. there exists an arithmetic expression $m$ over the free variables in $B$ and
$D$ such that $m \geq 0$, and $m = 0 \Rightarrow B$, and $D$ decreases $m$ infinitely often,
or until $m$ is 0, that is,

\[ D \subseteq_{\mathcal{T}^\dagger} (inv^*m ; A_m)^{\omega} \vee (inv^*m ; A_m ; inv^*m)^* ; \{m = 0\} ; inv^*m \]

then

\[
\begin{array}{l}
[P_1, \Gamma \cup \Gamma_m] \\
\quad [V{:}[B \wedge P_2, Q_2] \parallel D] \ \succ\ [\textbf{await } B \textbf{ then } V{:}[P_2, Q_2] \textbf{ end} \parallel D] \\
{[Q_1, \Delta]}
\end{array}
\]

where $\Gamma_m = \{ m \leq n \mid n \in \mathbb{N} \}$.

This rule allows the introduction of the synchronization statement await
with condition $B$. If

1. $V{:}[B \wedge P_2, Q_2] \parallel D$ guarantees $Q_1$ and $\Delta$ under assumptions $P_1$ and $\Gamma$, and

2. the synchronization condition $B$ can be shown to eventually hold
forever, that is, the parallel program $D$ in any context decreases some
measure $m$ either infinitely often or at least until it equals 0,

then $V{:}[B \wedge P_2, Q_2] \parallel D$ can be replaced by await $B$ then $V{:}[P_2, Q_2]$ provided
the environment never increases the measure. The correctness of this rule relies
on the fairness of parallel composition.
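
On a conventional platform, an await statement is commonly realized with a condition variable. The following sketch assumes Python's standard `threading` module and shows one way to run `await ack then done := tt end` in parallel with `ack := tt`. The class and function names are hypothetical, and the mapping from await to `Condition.wait_for` is an illustration, not the thesis's implementation model.

```python
import threading


class Shared:
    """Shared state for the two processes (names are illustrative)."""

    def __init__(self) -> None:
        self.ack = False
        self.done = False
        self.cond = threading.Condition()


def waiter(s: Shared) -> None:
    # await ack then done := tt end
    with s.cond:
        s.cond.wait_for(lambda: s.ack)  # blocks until the guard holds
        s.done = True                   # body runs while the lock is held


def setter(s: Shared) -> None:
    # ack := tt; this is the step that decreases the measure m from 1 to 0
    with s.cond:
        s.ack = True
        s.cond.notify_all()


def run() -> bool:
    s = Shared()
    threads = [threading.Thread(target=waiter, args=(s,)),
               threading.Thread(target=setter, args=(s,))]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return s.done
```

The waiter terminates only because the other process eventually sets `ack` and nothing resets it afterwards, which is the informal content of the measure condition and of the assumptions $\Gamma_m$.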

Example  5.3  (Introduction  rules) 

1.  An  application  of  NEW-INTRO  to  11^  of  Example  5.2  yields 

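\[
\begin{array}{ll}
[tt, \{y \text{ odd}, z \bmod 4 = 0\}] & \\
\quad [y{:}[tt, y \text{ odd}] \parallel z{:}[tt, z \bmod 4 = 0]] & \\
\succ & \text{NEW-INTRO} \\
\quad \textbf{new } x = 0 \textbf{ in} & \\
\quad\quad [x := 2 ; y := x + 1 \parallel x := 10 ; z := x + x] & \\
{[y \text{ odd} \wedge z \bmod 4 = 0, Preds(Var \setminus \{y, z\})]}. &
\end{array}
\]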

2. In Chapter 6.3, a program to find the maximum in an array $A$ of integers
using $n$ parallel processors is developed. The problem is broken into two
sequential parts. First, a boolean auxiliary array $m$ is introduced and set
such that $m[i]$ is true if and only if $A[i]$ is maximal, that is,

\[ P = \forall 1 \leq i \leq n. \; m[i] \Leftrightarrow max(A) = A[i]. \]

Then, $m$ is used to set $x$ to the maximum of $A$. When deriving the second
part, we use PAR-INTRO to introduce parallelism

\[
\begin{array}{ll}
[P, \Gamma] & \\
\quad x{:}[tt, tt]^* ; \{max(A) = x\} & \\
\succ & \text{PAR-INTRO} \\
\quad \parallel_{i=1}^{n} \textbf{if } m[i] \textbf{ then } x := A[i] & \\
{[max(A) = x, \Delta]} &
\end{array}
\]

where $\Gamma$ ensures the preservation of $P$ and of the result $max(A) = x$.

3. In Chapter 6.1 a while loop is introduced to compute the sum $\Sigma A$ over
an array $A$.

[tt,A] 

({^  <  n  Al]  ;  ;  {<  =  SA} 

WHILE-INTRO 

while $k \leq n$ do $t := t + A[k]$ ; $k := k + 1$


where  T  is  such  that  it  preserves  the  loop  invariant  /  =  i 


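4. For an example of the AWAIT-INTRO rule, let $m$ be 0 if the boolean
variable $ack$ is true, and 1 otherwise.

\[ m = \begin{cases} 0, & \text{if } ack \\ 1, & \text{otherwise.} \end{cases} \]

Then, an await statement can be introduced as follows

\[
\begin{array}{ll}
[tt, \{m = 0, m = 1, done\}] & \\
\quad [done{:}[ack, done] \parallel ack := tt] & \\
\succ & \text{AWAIT-INTRO} \\
\quad [\textbf{await } ack \textbf{ then } done{:}[tt, done] \textbf{ end} \parallel ack := tt] & \\
{[done, Preds(Var \setminus \{ack, done\})]}. &
\end{array}
\]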

□ 
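
The parallel program introduced by PAR-INTRO in the second item can be explored exhaustively for small arrays. The sketch below assumes that each component `if m[i] then x := A[i]` executes as one atomic step, so that every execution corresponds to some order of the $n$ components; the function name `parallel_max` is hypothetical.

```python
from itertools import permutations
from typing import List, Optional, Set


def parallel_max(A: List[int]) -> Set[Optional[int]]:
    """Return the set of final values of x over all execution orders of the
    n components 'if m[i] then x := A[i]' (A is assumed to be non-empty)."""
    top = max(A)
    m = [value == top for value in A]    # predicate P: m[i] iff A[i] is maximal
    finals: Set[Optional[int]] = set()
    for order in permutations(range(len(A))):
        x: Optional[int] = None          # x is unconstrained initially
        for i in order:                  # one atomic step per component
            if m[i]:
                x = A[i]
        finals.add(x)
    return finals
```

Every order yields the same result because all components that write to `x` write the same value, which is why the postconditions of the components jointly imply $max(A) = x$.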


5.4.5  Using  the  calculus 

We say that a refinement $\mathcal{R}$ was derived using the calculus, if every refinement
in the derivation of $\mathcal{R}$ was obtained using either a basic rule, a derived rule,
or an introduction rule. Moreover, every application of ATOM must have used
the rule ASSCOM to obtain the assumption-commitment formulas of the atomic
statements involved.

Given a refinement that was derived using the calculus, the following lemma
allows us to reverse weakening through strengthening of assumptions and
weakening of the guarantees. It will be a crucial proof-theoretic tool.

Lemma 5.6 (Weakest precondition and strongest postcondition of $\mathcal{R}$)
Let $\mathcal{R}$ be a refinement

\[ \mathcal{R} = [P, \Gamma] \; C \succ_V C' \; [Q, \Delta] \]

that was derived using the calculus. Then there exist sets of predicates $wp(\mathcal{R})$
and $sp(\mathcal{R})$ such that

• $wp(\mathcal{R}) \subseteq \Gamma$ and $P \Rightarrow \bigwedge wp(\mathcal{R})$, and

• $sp(\mathcal{R}) \subseteq \Gamma$ and $\bigwedge sp(\mathcal{R}) \Rightarrow Q$, and

• $[\bigwedge wp(\mathcal{R}), \Gamma] \; C \succ_V C' \; [\bigwedge sp(\mathcal{R}), \Delta]$.

Proof: The proof proceeds by structural induction over the derivation of

\[ [P, \Gamma] \; C \succ_V C' \; [Q, \Delta] \]

and can be found in Section A.2.4 on page 249. ■


5.5  General  refinement  methodology 

Let $C_1$ be a high-level specification of the implementation that is to be derived.
$C_1$ can be viewed as an abstract statement of the computation to be performed.
More precisely, $C_1$ defines the executions that all refinements, and thus also the
final implementation, are allowed to exhibit. In Chapter 6.3, for instance, $C_1$ is

\[ C_1 = x{:}[tt, tt]^* ; \{max(x)\} \]

where the predicate $max(x)$ is true if and only if $x$ is greater than or equal to all
entries of some array $A$. Refinements of $C_1$ are thus required to only change $x$ a
finite number of times before they terminate in a state in which $x$ contains the
maximum of array $A$. The refinement of $C_1$ then proceeds by finding a sequence
of programs $C_2, \ldots, C_n$ such that

\[ [P, \Gamma_i] \; C_i \succ C_{i+1} \; [Q, \Delta_i] \]

for $1 \leq i < n$. Typically, a single refinement step would

• introduce either local variables, local channels, loops or parallel
compositions. The first refinement in Chapter 6.3, for instance, introduces the
local boolean variables $m[1]$ through $m[n]$ and the finite loop

\[ \{m[1], \ldots, m[n]\}{:}[tt, \forall 1 \leq i \leq n. P_i]^* ; \{I\} \]

where $P_i$ specifies that each $m[i]$ cannot be set once it has been reset and
$I$ expresses that $m[i]$ is set if and only if $A[i]$ contains the maximum of
$A$. This program says that each $m[i]$ can only be changed in such a way
that $P_i$ holds at the end of each transition. Moreover, $I$ must hold upon
termination of the loop. Given $I$ and array $m$ it is then straightforward
to determine the maximum of $A$.

• or replace an abstract statement by a more concrete one. The next to the
last refinement in Chapter 6.3, for instance, replaces

\[ m[i]{:}[tt, P_i \wedge I_{i,j}] \]

by

\[ \textbf{if } A[i] < A[j] \textbf{ then } m[i] := ff \]

where $I_{i,j}$ requires $m[i]$ to be set if $A[i]$ is the maximum and $m[i]$ to be
reset if $A[j]$ is greater than $A[i]$.

With transitivity (Lemma 5.3) the above sequence of refinements then implies

\[ \Bigl[P, \bigcup_i \Gamma_i\Bigr] \; C_1 \succ C_n \; \Bigl[Q, \bigcap_i \Delta_i\Bigr] \]

which yields

\[ \{P\} ; C_1 \supseteq_{\mathcal{T}^\dagger} \{P\} ; C_n \]

and

\[ \{P\} \; C_n \; \{Q_n\} \]

with weakening and Lemma 5.5.3. Thus, every execution $\alpha$ of $C_n$ that starts
in a state satisfying $P$ is also an execution of $C_1$, and whenever $\alpha$ is finite,
the last state satisfies $Q_n$. Note that the refinement process typically is not
deterministic, that is, at each stage $C_i$ in the refinement process several rules
may be applicable, each leading to a different refinement $C_{i+1}$. Refinement thus
gives rise to a tree rather than a linear sequence. It is the task of the user to find
the path through the tree that leads to the desired result. Figure 5.2 depicts the
refinement process. Also note that the refinement methodology assumes that
all $C_i$ have a non-empty set of executions. Care must thus be taken to ensure
that both the initial program and all of its refinements have a non-empty set of
executions.

Remember, however, that according to Proposition 2.2 all programs that
contain the standard programming language constructs only, do have non-empty
sets of executions. Consequently, whenever the most refined program $C_n$ is
syntactically well-formed in the sense of Proposition 2.2, then the entire refinement
is non-trivial.
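
The bookkeeping behind the transitivity step can be sketched as follows: chaining refinements accumulates assumptions by union, commitments by intersection, and hidden local variables by union. The `Step` record, its field names, and the representation of predicates as strings are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from functools import reduce
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Step:
    """One refinement step [P, assumptions] source >_hidden target [Q, commitments]."""
    source: str
    target: str
    assumptions: FrozenSet[str]
    commitments: FrozenSet[str]
    hidden: FrozenSet[str] = frozenset()


def compose(first: Step, second: Step) -> Step:
    """Combine two adjacent steps in the way transitivity does."""
    if first.target != second.source:
        raise ValueError("steps are not adjacent")
    return Step(first.source, second.target,
                first.assumptions | second.assumptions,   # union of Gamma_i
                first.commitments & second.commitments,   # intersection of Delta_i
                first.hidden | second.hidden)             # union of V_i


def chain(steps: Iterable[Step]) -> Step:
    return reduce(compose, steps)
```

The composed step therefore holds only in environments that satisfy every assumption used anywhere along the path through the refinement tree.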

5.5.1  Notation 

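1. A refinement statement may be written as

\[ \mathcal{R} = [P, \Gamma] \; C_1 \succ_V C_2 \; [Q, \Delta] \qquad \text{justification} \]

or as

\[
\begin{array}{ll}
\mathcal{R} = [P, \Gamma] & \\
\quad C_1 \succ_V C_2 & \text{justification} \\
{[Q, \Delta]} &
\end{array}
\]

or as

\[
\begin{array}{ll}
\mathcal{R} = [P, \Gamma] & \\
\quad C_1 & \\
\succ_V & \text{justification} \\
\quad C_2 & \\
{[Q, \Delta]} &
\end{array}
\]

depending on the size of $C_1$ and $C_2$, where justification is a list of
references, each to a lemma, a proposition, or a refinement rule.
The name $\mathcal{R}$ and the justification may be omitted.

Figure 5.2: General shape of the refinement process (from the specification $C_0$, adding more detail, down to the implementation)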

2. Very often, a single refinement does not suffice to derive the desired
result and a sequence of refinements is needed. Sometimes all refinements
in that sequence hold under the same assumptions and guarantees. To
express the situation concisely, we introduce the following notation. Let
$C_1$ through $C_n$ be programs with a non-decreasing set of free variables,
that is, $fv(C_i) \subseteq fv(C_{i+1})$ for all $1 \leq i \leq n - 1$. Then,

\[
\begin{array}{ll}
\mathcal{R} = [P, \Gamma] & \\
\quad C_1 & \\
\succ_{V_1} & \text{justification}_1 \\
\quad C_2 & \\
\quad \vdots & \\
\quad C_{n-1} & \\
\succ_{V_{n-1}} & \text{justification}_{n-1} \\
\quad C_n & \\
{[Q, \Delta]} &
\end{array}
\]

abbreviates

\[ [P, \Gamma] \; C_i \succ_{V_i} C_{i+1} \; [tt, \Delta] \qquad \text{justification}_i \]

for all $1 \leq i \leq n - 2$ and

\[ [P, \Gamma] \; C_{n-1} \succ_{V_{n-1}} C_n \; [Q, \Delta]. \qquad \text{justification}_{n-1} \]

Note that by using transitivity $n - 1$ times, the refinements above imply

\[ [P, \Gamma] \; C_1 \succ_V C_n \; [Q, \Delta] \qquad \text{Lemma 5.3} \]

where $V = \bigcup_i V_i$. Consider, for example, the derivation of a program
that swaps the values of two variables $x$ and $y$.


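\[
\begin{array}{ll}
[x = m \wedge y = n, \{x = m, y = n, x = n, y = m, tmp = m\}] & \\
\quad \{x, y\}{:}[tt, tt]^* & \\
\succ & \text{Lemma 3.4.1} \\
\quad \{x, y\}{:}[tt, tt] & \\
\succ & =_{\mathcal{T}^\dagger} \text{(Lemma 2.2.5), Lemma 3.4.1} \\
\quad \textbf{skip} ; \{x\}{:}[tt, tt] ; \{y\}{:}[tt, tt] & \\
\succ & =_{\mathcal{T}^\dagger} \text{(Lemma 2.2.5), Lemma 3.4.1} \\
\quad \textbf{skip} ; x := n ; y := m & \\
\succ_{\{tmp\}} & \text{ATOM, SEQ} \\
\quad tmp := x ; x := y ; y := tmp & \\
{[x = n \wedge y = m, Preds(Var \setminus \{tmp, x, y\})]}. &
\end{array}
\]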

This sequence implies

\[
\begin{array}{l}
[x = m \wedge y = n, \{x = m, y = n, x = n, y = m, tmp = m\}] \\
\quad \{x, y\}{:}[tt, tt]^* \\
\succ_{\{tmp\}} \\
\quad tmp := x ; x := y ; y := tmp \\
{[x = n \wedge y = m, Preds(Var \setminus \{tmp, x, y\})]}.
\end{array}
\]

5.6 Fine-grained concurrency

We revisit the extension to finer levels of granularity of Section 2.4 and discuss
how it meshes with refinement.

On the one hand, finer-grained concurrency gives the specifier
finer-grained control over, for instance, evaluation strategies. However, during the
initial,  more  high  level  phases  of  the  development  process  we  typically  want 
to  abstract  from  these  kinds  of  low-level  detail.  What  counts  is,  for  instance, 
that  if  some  boolean  expression  B  holds,  then  Ci  must  be  executed  to  maintain 
some  invariant,  and  otherwise  C2.  We  want  to  get  the  logic  of  the  program 
right  without  having  to  worry  about  how  precisely  B  will  be  evaluated.  This 
gives  us  the  desired  abstraction,  but  also  makes  the  development  process  more 
portable.  The  more  the  introduction  of  low-level,  machine-dependent  aspects 
is  postponed,  the  more  will  the  initial  phases  of  the  design  be  adaptable  to 
different  machines  or  architectures. 

However, this delay typically comes at a price. As demonstrated in
Section 2.4 the replacement of atomic by non-atomic expressions, for instance,
can introduce a lot of surprising, unwanted behaviour. Additional assumptions
must be placed on the environment to ensure soundness. In this section, we
show that the refinement calculus meshes very well with finer levels of
granularity. Sufficient conditions can be found which allow the replacement of atomic
boolean expressions by non-atomic ones. Moreover, the calculus clearly shows
how different evaluation strategies require different environment assumptions.

Recall that the boolean, binary operations $\wedge_{lr}$, $\wedge_{rl}$, and $\wedge_p$ compute
conjunctions by evaluating their arguments either from left to right, from right to left,
or in parallel. Let $x$ and $y$ be two boolean variables and $C_1$ and $C_2$ be two
programs and suppose we want to use the rule COND to refine

\[ C = (\{x \wedge y\} ; C_1) \vee (\{\neg(x \wedge y)\} ; C_2) \]

into a conditional

\[ C' = \textbf{if } x \; op \; y \textbf{ then } C_1' \textbf{ else } C_2' \]

where $op$ stands for one of the three conjunction operations and $C_1$ is refined into
$C_1'$ under assumptions $\Gamma_1$ and guarantees $\Delta_1$ and similarly for $C_2$ and $\Gamma_2$ and
$\Delta_2$. Depending on which evaluation strategy we choose, we get different minimal
assumptions.

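If variable $x$ is evaluated first, then its value must be preserved, that is, the
value of $x$ must not change.

\[
\begin{array}{ll}
[tt, \{x, \neg x\} \cup \Gamma_1 \cup \Gamma_2] & \\
\quad (\{x \wedge y\} ; C_1) \vee (\{\neg(x \wedge y)\} ; C_2) & \\
\succ & \text{COND} \\
\quad \textbf{if } x \wedge_{lr} y \textbf{ then } C_1' \textbf{ else } C_2' & \\
{[tt, \Delta_1 \cap \Delta_2]} &
\end{array}
\]

If, however, variable $y$ is evaluated first, then the value of $y$ must not change.

\[
\begin{array}{ll}
[tt, \{y, \neg y\} \cup \Gamma_1 \cup \Gamma_2] & \\
\quad (\{x \wedge y\} ; C_1) \vee (\{\neg(x \wedge y)\} ; C_2) & \\
\succ & \text{COND} \\
\quad \textbf{if } x \wedge_{rl} y \textbf{ then } C_1' \textbf{ else } C_2' & \\
{[tt, \Delta_1 \cap \Delta_2]} &
\end{array}
\]

Finally, if both arguments are evaluated in parallel, neither $x$ nor $y$ can be
allowed to change.

\[
\begin{array}{ll}
[tt, \{x, \neg x, y, \neg y\} \cup \Gamma_1 \cup \Gamma_2] & \\
\quad (\{x \wedge y\} ; C_1) \vee (\{\neg(x \wedge y)\} ; C_2) & \\
\succ & \text{COND} \\
\quad \textbf{if } x \wedge_{p} y \textbf{ then } C_1' \textbf{ else } C_2' & \\
{[tt, \Delta_1 \cap \Delta_2]} &
\end{array}
\]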

Note that the contexts given by each of the first two refinements are more
discriminating than the context given by the last refinement. In other words,
the restrictions placed on the last context subsume the restrictions placed on
each of the first two contexts. Consequently, all three refinements hold under the
assumptions $\{x, \neg x, y, \neg y\}$. We see that different evaluation strategies require
different assumptions. The trace semantics seems well suited to capture
fine-grained evaluation strategies. The refinement calculus not only supports the
refinement of atomic expressions into non-atomic expressions, but also allows
for the comparison of different evaluation strategies.

Note that finer-grained parallelism does not always require additional
assumptions. Consider, for instance, the refinement

\[ [P, \Gamma] \; x := e \succ x := v \; [Q, \Delta] \]

in the coarse-grained setting. Typically, the validity of this refinement depends
on the variables in expression $e$ carrying certain values. Thus, these variables
need to be protected from interference before execution of the assignment and
the appropriate assumptions need to be placed in $\Gamma$. The somewhat surprising
point is that this refinement would continue to hold if the atomicity
assumption on expression evaluation is dropped. From a reasoning point of view,
interference during the execution of the assignment is often just as detrimental as
interference right before or after the execution of the assignment.

5.7  Discussion 

Before we illustrate the use of the calculus in the following chapters, we briefly
summarize the advantages and disadvantages of the refinement calculus
presented in this chapter. The refinement calculus

• supports stepwise, top-down program development,

•  is  context-sensitive, 

•  supports  the  introduction  of  local  variables  and  channels, 

• treats shared-variable and message-passing concurrency uniformly,

•  supports  fine-grained  concurrency, 

•  is  based  on  a  powerful,  fully  abstract  semantics. 

We will see in the following chapters to what extent compositional reasoning
and reasoning about liveness properties are supported. However, the refinement
calculus also

•  only  supports  safety  properties  as  assumptions.  Liveness  properties,  for 
instance,  are  not  allowed, 

•  currently  lacks  a  completeness  result, 

• contains rules whose precise shape is hard to justify. More precisely, while
the given introduction rules will allow the derivation of a number of
algorithms, a number of alternative introduction rules could be given. At
the moment, we lack a formal justification for the choice and shape of
the introduction rules. The variety of examples, however, gives a strong
empirical indication that the rules provide a good starting point.

5.7.1  Alternative  definitions  of  refinement 

It is instructive to review alternative definitions of the refinement relation.
Below, two alternative definitions for the refinement relation $\succ$ will be given.

Definition 5.2 (Alternative definitions of $\succ$)

1. Let

\[ [P, \Gamma] \; C \succ'_V C' \; [Q, \Delta] \]

be defined as the refinement relation $\succ$ in Definition 5.1, except that the
fourth clause is replaced by

4. we have

\[ C \geq_E C', \]

for all contexts $E$ of the form

E  =  new  =  ui , . . . ,  x„  =  Un  in  {P}  ;  [□  ||  pre'^T] 


where $v_i \in Dom_{x_i}$ for all $1 \leq i \leq n$.

2. Let

C  Drt  C'  {PJ,x) 

abbreviate  that  for  all  a  E  T^|C^]  such  that  a  \=  assump{P,T)j  and 
{x  =  v)a  E  T^|C"|  for  some  v  E  Doin^,  there  exists  /?  E  such  that 

P  1=  assump{P^T)^  and  {x  —  v)P  E  and  a\x  =  P\x.  Given  a  set 

of  variables  let  C  Dj-t  C'  (P,  F,  V^)  be  the  obvious  generalization.  Let 

[PJ]  [g,A] 

be  defined  as  the  refinement  relation  in  Definition  5.1,  except  that  the 
fourth  clause  is  replaced  by 


4.  we  have  C  D^t  C"  (P,  F,  V). 


□ 


The definition of $\succ'$ is very close to Definition 5.1. It avoids the use of the
modulo notation $(mod\ V)$ by hiding the changes to the variables in $V$ in a local
variable declaration. However, it also forces the explicit initialization of the
variables in $V$ and thus gives rise to an awkward quantification over all possible
initial values of the local variables. The relations $\succ$ and $\succ'$ are equivalent.

Proposition 5.1 (Equivalence of $\succ'$ and $\succ$)
Refinement

\[ [P, \Gamma] \; C \succ_V C' \; [Q, \Delta] \]

is valid if and only if

\[ [P, \Gamma] \; C \succ'_V C' \; [Q, \Delta] \]

is valid.


Proof:  Let  E  =  {P}  ;  [[]  ||  pre^T].  We  have  to  show  C  >e  C  {mod  17)  iff 
C  >E>  for  all  contexts  E'  of  the  form 

E'  ~  new  Xi  =  Vn,  ^  ^  •,Xn  Vn  in  E 

where $v_i \in Dom_{x_i}$ for all $1 \leq i \leq n$. $C \geq_E C' \ (mod\ V)$ is short for $E[(C)] \supseteq_{\mathcal{T}^\dagger}
E[(C')] \ (mod\ V)$. By Lemma 4.3, this execution inclusion modulo $V$ is
equivalent to $E'[(C)] \supseteq_{\mathcal{T}^\dagger} E'[(C')]$, which is equivalent to $C \geq_{E'} C'$. ■


There is a subtle difference between $\succ''$ and $\succ$. Note that the initial state of a
trace of $C'$ in the definition of $\succ''$ satisfies the precondition $P$ without the use of
environment assumptions. In the definition of $\succ$ (or $\succ'$), however, assumptions
are needed to guarantee preservation of the precondition. Consequently, the two
relations are not equivalent. For instance,

\[ [x = 3, Preds(\emptyset)] \; y := 4 \succ'' y := x + 1 \; [tt, Preds(\emptyset)] \qquad (5.1) \]

is valid whereas

\[ [x = 3, Preds(\emptyset)] \; y := 4 \succ y := x + 1 \; [tt, Preds(\emptyset)] \]

is not. If, however, the assumptions are such that the precondition is always
preserved, the two relations coincide.

Proposition 5.2 (Relationship between $\succ''$ and $\succ$)
Let $\Gamma' \subseteq \Gamma$. Then,

\[ \Bigl[\bigwedge \Gamma', \Gamma\Bigr] \; C \succ_V C' \; [Q, \Delta] \]

if and only if

\[ \Bigl[\bigwedge \Gamma', \Gamma\Bigr] \; C \succ''_V C' \; [Q, \Delta]. \]

Proof: The assumption $\Gamma' \subseteq \Gamma$ ensures the preservation of the precondition
$\bigwedge \Gamma'$. Thus, the difference between the two definitions vanishes. ■

Let $\mathcal{C}$ denote the refinement calculus of Section 5.4. Calculus $\mathcal{C}''$ arises from $\mathcal{C}$
by replacing every occurrence of $\succ$ by $\succ''$. Note that all rules remain sound. The
Rule ASSCOM ensures that a precondition used to show refinement between two
atomic statements always occurs in the assumptions. Consequently, the above
counterexample (5.1) is not derivable using $\mathcal{C}''$ and the difference between the two
relations again disappears.


Proposition 5.3 (Relationship between $\succ''$ and $\succ$)
1. All rules in $\mathcal{C}''$ are sound.

2. Refinement

\[ [P, \Gamma] \; C \succ_V C' \; [Q, \Delta] \]

is derivable using $\mathcal{C}$ if and only if

\[ [P, \Gamma] \; C \succ''_V C' \; [Q, \Delta] \]

is derivable using $\mathcal{C}''$.

Proof: 2) Lemma 5.6 also holds in $\mathcal{C}''$. The proof is identical. The desired
result follows directly from both versions of this lemma. ■

The definition of $\succ''$ is interesting, because it avoids the use of context-sensitive
approximation and thus of labels. The standard, unlabeled transition traces
suffice for this definition. In other words, the calculus, as presented so far,
could also be defined without labels. However, in Chapter 8 context-sensitive
approximation and labels will be crucial for the specification and proof of certain
properties and transformations.

5.8 Summary of the refinement rules


Figure  5.3:  Assumption-commitment  rules 


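Figure 5.4: Basic refinement rules

OR

\[
\frac{[P, \Gamma_1] \; C_1 \succ_{V_1} C_1' \; [Q, \Delta_1] \qquad [P, \Gamma_2] \; C_2 \succ_{V_2} C_2' \; [Q, \Delta_2]}
     {[P, \Gamma_1 \cup \Gamma_2] \; C_1 \vee C_2 \succ_{V_1 \cup V_2} C_1' \vee C_2' \; [Q, \Delta_1 \cap \Delta_2]}
\]

where $wf(C_1, V_2)$ and $wf(C_2, V_1)$.

STAR

\[
\frac{[I, \Gamma] \; C \succ_V C' \; [I, \Delta]}
     {[I, \Gamma] \; C^* \succ_V (C')^* \; [I, \Delta]}
\]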

OMEGA 

[7,r]  C>-vC'  [7,  A] 

[7,r]  C“^v(C'r  [^A] 

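PAR

\[
\frac{[P_1, \Gamma_1] \; C_1 \succ_{V_1} C_1' \; [Q_1, \Delta_1] \qquad [P_2, \Gamma_2] \; C_2 \succ_{V_2} C_2' \; [Q_2, \Delta_2]}
     {[P_1 \wedge P_2, \Gamma_1 \cup \Gamma_2] \; C_1 \parallel C_2 \succ_{V_1 \cup V_2} C_1' \parallel C_2' \; [Q_1 \wedge Q_2, \Delta_1 \cap \Delta_2]}
\]

where $\Gamma_1 \subseteq \Delta_2$ and $\Gamma_2 \subseteq \Delta_1$ and $wf(C_1, V_2)$ and $wf(C_2, V_1)$.

NEW

\[
\frac{[P, \Gamma] \; C \succ_V C' \; [Q, \Delta] \qquad x \notin V}
     {[P[v/x], \Gamma'] \; \textbf{new } x = v \textbf{ in } C \succ_V \textbf{new } x = v \textbf{ in } C' \; [\exists x. Q, \Delta']}
\]

where

\[ \Gamma' = \{ P[v'/x] \mid P \in \Gamma \wedge v' \in Dom_x \} \]

\[ \Delta' = \{ P, P[v'/x] \mid P[v'/x] \in \Delta \wedge v' \in Dom_x \}. \]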

WEAK 


[P'X]  C[^v>C^  [Q',A'] 

[p,r]  Ci^vCz  [g,A] 

where  C[  =rt  Ci,  C2  Crt  C^,  P  =i>  P',  Q'  =>  Q,  T'  C  T,  A  C  A',  and 
V  C  V. 


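Figure 5.5: Basic refinement rules (continued)

COND

If

1. $[P \wedge B, \Gamma_1] \; C_1 \succ_{V_1} C_1' \; [Q, \Delta_1]$, and

2. $[P \wedge \neg B, \Gamma_2] \; C_2 \succ_{V_2} C_2' \; [Q, \Delta_2]$, and

3. $P \Rightarrow (B \Leftrightarrow B')$, and

4. $wf(C_1, V_2)$ and $wf(C_2, V_1)$,

then

\[
\begin{array}{l}
[P, \Gamma_1 \cup \Gamma_2 \cup \{P, B', \neg B'\}] \\
\quad \textbf{if } B \textbf{ then } C_1 \textbf{ else } C_2 \ \succ_{V_1 \cup V_2}\ \textbf{if } B' \textbf{ then } C_1' \textbf{ else } C_2' \\
{[Q, \Delta_1 \cap \Delta_2]}.
\end{array}
\]

FOR

If

1. $i$ is an integer variable that is never assigned to in $C$ or $C'$, and

2. $[P[k-1/i], \Gamma_k] \; C[k/i] \succ_V C'[k/i] \; [P[k/i], \Delta_k]$ for all $1 \leq k \leq n$,

then

\[
\begin{array}{l}
[P[0/i], \bigcup_{i=1}^{n} \Gamma_i] \\
\quad \textbf{for } i = 1 \textbf{ to } n \textbf{ do } C \ \succ_V\ \textbf{for } i = 1 \textbf{ to } n \textbf{ do } C' \\
{[P[n/i], \bigcap_{i=1}^{n} \Delta_i]}.
\end{array}
\]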

WHILE 

[/AS,r]  C^vC  [/,A]  fviB)nV  =  Q 

[I,TU{B',^B']]  while  B  do  C  while  B' do  C'  [/A-.B',A] 


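Figure 5.6: Derived refinement rules

PAR-N

If

1. $[P_i, \Gamma_i] \; C_i \succ_{V_i} C_i' \; [Q_i, \Delta_i]$, and

2. $\Gamma_i \subseteq \bigcap_{j=1, j \neq i}^{n} \Delta_j$, and

3. $wf(C_j, V_i)$ for all $1 \leq j \leq n$ with $j \neq i$,

for all $1 \leq i \leq n$, then

\[
\Bigl[\bigwedge_{i=1}^{n} P_i,\ \bigcup_{i=1}^{n} \Gamma_i\Bigr] \quad \parallel_{i=1}^{n} C_i \ \succ_{\bigcup_{i=1}^{n} V_i}\ \parallel_{i=1}^{n} C_i' \quad \Bigl[\bigwedge_{i=1}^{n} Q_i,\ \bigcap_{i=1}^{n} \Delta_i\Bigr].
\]

PAR-V

\[
\frac{[P_1, \Gamma_1] \; C_1 \succ C_1' \; [Q_1, \Delta_1] \qquad [P_2, \Gamma_2] \; C_2 \succ C_2' \; [Q_2, \Delta_2]}
     {[P_1 \wedge P_2, \Gamma_1 \cup \Gamma_2] \; C_1 \parallel C_2 \succ C_1' \parallel C_2' \; [Q_1 \wedge Q_2, \Delta_1 \cap \Delta_2]}
\]

where $\Gamma_1 \subseteq \Delta_2$ and $\Gamma_2 \subseteq \Delta_1$.

PAR-V-N

If

1. $[P_i, \Gamma_i] \; C_i \succ C_i' \; [Q_i, \Delta_i]$, and

2. $\Gamma_i \subseteq \bigcap_{j=1, j \neq i}^{n} \Delta_j$

for all $1 \leq i \leq n$, then

\[
\Bigl[\bigwedge_{i=1}^{n} P_i,\ \bigcup_{i=1}^{n} \Gamma_i\Bigr] \quad \parallel_{i=1}^{n} C_i \ \succ\ \parallel_{i=1}^{n} C_i' \quad \Bigl[\bigwedge_{i=1}^{n} Q_i,\ \bigcap_{i=1}^{n} \Delta_i\Bigr].
\]

Figure 5.7: Derived refinement rules (continued)

PAR-INTRO

If

1. $C$ is robust and preserves $\bigcap_{i} \Delta_i$ in all contexts, and

2. $[P, \Gamma_i] \; C \succ C_i \; [Q_i, \Delta_i]$ for all $1 \leq i \leq n$, and

3. $\Gamma_i \subseteq \bigcap_{j=1, j \neq i}^{n} \Delta_j$ for all $1 \leq i \leq n$, and

4. $(\forall 1 \leq i \leq n. Q_i) \Rightarrow Q$,

then

\[
\Bigl[P,\ \bigcup_{i=1}^{n} \Gamma_i\Bigr] \quad C^* ; \{Q\} \ \succ\ \parallel_{i=1}^{n} C_i \quad \Bigl[Q,\ \bigcap_{i=1}^{n} \Delta_i\Bigr].
\]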

WHILE-INTRO 

If 

L  [5A/,r]  C>-a  [/,A]and 

2.  (“iB  A  /)  ^  Q  and 

3.  there  exists  an  arithmetic  expression  m  over  the  free  variables  in  B 

and  C"  such  that  m  >  0,  and  m  =  0  ‘->B,  and 

C'  {inv*m  ;  Am  ;  inv*m)~^ 

then 

[/,FUF„,U{Q}]  ({SA/};C)*;{Q}  ^  while  P  do  C'  [Q,A] 

where  ^  ^ 

Am  =  Var:[it,  m=  0  ^  m  =  0  |  m  <m] 

Fm  =  <  n  I  n  G  N}. 

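FOR-INTRO

If

1. $[I[k/i], \Gamma_k] \; C \succ C'[k/i] \; [I[k+1/i], \Delta_k]$ for all $0 \leq k \leq n - 1$ and

2. $I[n/i] \Rightarrow Q$ and

3. $i$ is a constant,

then

\[
\Bigl[I[0/i],\ \bigcup_{i=1}^{n} \Gamma_i\Bigr] \quad C^* ; \{Q\} \ \succ\ \textbf{for } i = 1 \textbf{ to } n \textbf{ do } C' \quad \Bigl[Q,\ \bigcap_{i=1}^{n} \Delta_i\Bigr].
\]

Figure 5.8: Introduction rules

NEW-INTRO

\[
\frac{[P, \Gamma] \; C \succ_{V \cup \{x\}} C' \; [Q, \Delta]}
     {[P[v/x], \Gamma'] \; C \succ_V \textbf{new } x = v \textbf{ in } C' \; [\exists x. Q, \Delta']}
\]

where

\[ \Gamma' = \{ P[v'/x] \mid P \in \Gamma \wedge v' \in Dom_x \} \]

\[ \Delta' = \{ P, P[v'/x] \mid P[v'/x] \in \Delta \wedge v' \in Dom_x \}. \]

AWAIT-INTRO

If

1. $[P_1, \Gamma] \; [V{:}[B \wedge P_2, Q_2] \parallel D] \; [Q_1, \Delta]$ and

2. there exists an arithmetic expression $m$ over the free variables in $B$
and $D$ such that $m \geq 0$, and $m = 0 \Rightarrow B$, and

\[ D \subseteq_{\mathcal{T}^\dagger} (inv^*m ; A_m)^{\omega} \vee (inv^*m ; A_m ; inv^*m)^* ; \{m = 0\} ; inv^*m \]

then

\[
\begin{array}{l}
[P_1, \Gamma \cup \Gamma_m] \\
\quad [V{:}[B \wedge P_2, Q_2] \parallel D] \ \succ\ [\textbf{await } B \textbf{ then } V{:}[P_2, Q_2] \textbf{ end} \parallel D] \\
{[Q_1, \Delta]}
\end{array}
\]

where $\Gamma_m = \{ m \leq n \mid n \in \mathbb{N} \}$.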


Figure  5.9:  Introduction  rules  (continued) 


Chapter  6 

Developing  shared-variable 
parallel  programs 


This chapter illustrates the use of the calculus for the development of
shared-variable parallel programs. Four examples are given. Section 6.1 contains a
simple example to illustrate the basic use of the calculus. Section 6.2 derives a
shared-variable parallel implementation of the Warshall algorithm. Section 6.3
derives a shared-variable parallel program to find the maximum in an array of
integers. The derived implementation features nested parallelism. Alternative
derivations are discussed. Section 6.4 treats the generalization of the maximum
search problem: the first element in an array that satisfies a property is to
be found. We derive a shared-variable parallel program and show how further
refinement can lead to more efficiency.


6.1  Example:  Bank  accounts 

The following example has also been used in [AS85, XJ91, Din99b]. Suppose
n > 1 bank accounts are represented by an array A[1..n]. Let the constants a
and b with \(1 \le a, b \le n\) and \(a \ne b\) denote two accounts. We want to develop a
program which computes the sum s over all entries in A and concurrently also
transfers $20 from account a to account b. We start with a high-level program
C1 that is easily seen to be correct. Let C1 be

\[
C_1 \equiv [\, s := \Sigma A \;\|\; A[a], A[b] := A[a] - 20,\ A[b] + 20 \,]
\]

where \(\Sigma A\) stands for \(\Sigma A = \sum_{i=1}^{n} A[i]\). The summation of an array is not
implementable in a single atomic step. Program C1 thus needs to be refined. The
entire refinement is summarized in Figure 6.1.
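To make the starting point concrete, the following Python sketch models the two atomic actions of C1 and runs them in both possible orders. The function names (`atomic_sum`, `atomic_transfer`, `run_c1`) and the 0-based indexing are illustrative assumptions and not part of the calculus. The sketch only shows why C1 is "easily seen to be correct": the transfer does not change the sum, so both interleavings give the same s.

```python
from itertools import permutations


def atomic_sum(state):
    # Models s := ΣA executed as one indivisible step.
    state["s"] = sum(state["A"])


def atomic_transfer(state, a, b, amount=20):
    # Models the multiple assignment A[a], A[b] := A[a] - 20, A[b] + 20.
    A = state["A"]
    A[a], A[b] = A[a] - amount, A[b] + amount


def run_c1(A, a, b):
    """Return the set of final (s, A) pairs over both interleavings of C1."""
    outcomes = set()
    for order in permutations(["sum", "transfer"]):
        state = {"A": list(A), "s": None}
        for step in order:
            if step == "sum":
                atomic_sum(state)
            else:
                atomic_transfer(state, a, b)
        outcomes.add((state["s"], tuple(state["A"])))
    return outcomes
```

For example, `run_c1([100, 50, 30], 0, 2)` yields a single outcome, because the order of the two atomic steps does not matter at this level of abstraction.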
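\[
\begin{array}{lcl}
C_1 & \equiv & [\, s := \Sigma A \;\|\; A[a], A[b] := A[a]-20,\ A[b]+20 \,] \\[6pt]
C_2 & \equiv & \mathbf{new}\ k = 1,\ t = 0\ \mathbf{in} \\
 & & [\, \{k,t\}{:}[tt,tt]^{*} ;\ \{t = \Sigma A\} ;\ s := \Sigma A \;\|\; A[a], A[b] := A[a]-20,\ A[b]+20 \,] \\[6pt]
C_3 & \equiv & \mathbf{new}\ k = 1,\ t = 0\ \mathbf{in} \\
 & & [\, \{k,t\}{:}[tt,tt]^{*} ;\ \{t = \Sigma A\} ;\ s := t \;\|\; A[a], A[b] := A[a]-20,\ A[b]+20 \,] \\[6pt]
C_4 & \equiv & \mathbf{new}\ k = 1,\ t = 0\ \mathbf{in} \\
 & & [\, (\{k \le n \wedge I\} ;\ \{k,t\}{:}[tt,tt]^{*})^{*} ;\ \{t = \Sigma A\} ;\ s := t \;\|\; A[a], A[b] := A[a]-20,\ A[b]+20 \,] \\
 & & \text{where } I \equiv k - 1 \le n \wedge t = \sum_{i=1}^{k-1} A[i] \\[6pt]
C_5 & \equiv & \mathbf{new}\ k = 1,\ t = 0\ \mathbf{in} \\
 & & [\, \mathbf{while}\ k \le n\ \mathbf{do}\ t := t + A[k] ;\ k := k + 1\ \mathbf{od} ;\ s := t \;\|\; \{A[a],A[b]\}{:}[P,Q] \,] \\
 & & \text{where } P \equiv (k < a \wedge k < b) \vee (k > a \wedge k > b) \\
 & & \text{and } Q \equiv A[a] = \overleftarrow{A[a]} - 20 \wedge A[b] = \overleftarrow{A[b]} + 20 \\[6pt]
C_6 & \equiv & \mathbf{new}\ k = 1,\ t = 0\ \mathbf{in} \\
 & & [\, \mathbf{while}\ k \le n\ \mathbf{do}\ t := t + A[k] ;\ k := k + 1\ \mathbf{od} ;\ s := t \\
 & & \;\|\; \mathbf{await}\ P\ \mathbf{then}\ A[a], A[b] := A[a]-20,\ A[b]+20\ \mathbf{end} \,]
\end{array}
\]

Figure 6.1: Derivation of a solution to the bank problem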



Refining C1 into C2

We will compute \(\Sigma A\) in the standard way using a loop that steps over A and
keeps the partial sum of elements seen so far. To this end, we first introduce
two local variables k and t and a finite loop that modifies these two variables
only and that is required to terminate in a state in which t contains the sum
over A.

Let Qs be the postcondition of the left parallel subprogram and let Pab and
Qab be the pre- and postcondition of the right parallel subprogram, that is,

\[
Q_s \equiv s = \Sigma A, \qquad
P_{ab} \equiv A[a] = v_1 \wedge A[b] = v_2, \qquad
Q_{ab} \equiv A[a] = v_1 - 20 \wedge A[b] = v_2 + 20
\]

where \(v_1, v_2\) are integers. Formally, this refinement is based on

7^l  =  [«,{Q.}] 

s:=EA 

X  =^4; (Lemma  2.1),  Lemma  5.4 

skip*  ;  skip  ; 

y{k,t}  ATOM,  SEQ,  STAR 

;  {i  T,A]  ;  s:=T,A 
\Qs->  {Pab-)  Qab}]- 

We now derive an assumption-commitment formula for the right subprogram.
We need to show that it terminates in the desired state and that it preserves at
least the assumptions of the left subprogram, that is, the predicate Qs.

\[
\begin{array}{ll}
\mathcal{R}_2 \equiv & [P_{ab},\ \{P_{ab}, Q_{ab}\}] \\
 & A[a], A[b] := A[a]-20,\ A[b]+20 \qquad \text{ATOM} \\
 & [Q_{ab},\ \Lambda]
\end{array}
\]

where \(\Lambda = \{Q_s,\ t = \Sigma A\} \cup \mathit{Preds}(\{k,t,s\})\) and \(\mathit{Preds}(V)\) denotes the set of
all predicates over the variables in V. In other words, the multiple assignment
statement preserves at least Qs, \(t = \Sigma A\), and all predicates over k, t, and s. It
also preserves a lot of other predicates, but for the sake of simplicity we only
mention the necessary ones.

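To derive the desired refinement between C1 and C2, we put the refinements
\(\mathcal{R}_1\) and \(\mathcal{R}_2\) in parallel and then declare the variables k and t. Formally,

\[
\begin{array}{ll}
 & [P_{ab},\ \{P_{ab}, Q_{ab}, Q_s\}] \\
 & C_1 \succ C_2 \qquad \text{PAR-N}(\mathcal{R}_1,\mathcal{R}_2),\ \text{NEW-INTRO}(k,t) \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)].
\end{array}
\]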
Refining C2 into C3

If the predicate \(t = \Sigma A\) is preserved by the environment, then the abstract
assignment \(s := \Sigma A\) can safely be replaced by \(s := t\). Formally, we have

\[
\begin{array}{ll}
\mathcal{R}_3 \equiv & [t = \Sigma A,\ \{Q_s,\ t = \Sigma A\}] \\
 & s := \Sigma A \;\succ\; s := t \qquad \text{ATOM} \\
 & [Q_s,\ \{P_{ab}, Q_{ab}\}].
\end{array}
\]

Refinement between C2 and C3 follows in a syntax-directed fashion.

\[
\begin{array}{ll}
\mathcal{R}_4 \equiv & [P_{ab},\ \{P_{ab}, Q_{ab}, Q_s\} \cup \{\Sigma A = v \mid v \in \mathbb{N}\}] \\
 & C_2 \succ C_3 \qquad \text{SEQ},\ \text{PAR}(\mathcal{R}_2),\ \text{NEW}(k,t) \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)]
\end{array}
\]

is determined by the structure of C2 and C3 in a straightforward,
syntax-directed fashion and thus omitted. Note that according to \(\mathcal{R}_2\), the right parallel
subprogram preserves the predicates Qs and \(t = \Sigma A\) as required by \(\mathcal{R}_3\). More
precisely, \(\{Q_s,\ t = \Sigma A\} \subseteq \Lambda\). Moreover, note how the application of NEW
simplifies the assumptions. The requirement that \(t = \Sigma A\) is preserved, which
stems from \(\mathcal{R}_3\), is replaced in \(\mathcal{R}_4\) by the requirement that the sum over A, \(\Sigma A\),
is left unchanged.

Refining C3 into C4

We now equip the loop in C3 with a termination condition \(B \equiv k \le n\) and an
invariant \(I \equiv k - 1 \le n \wedge t = \sum_{i=1}^{k-1} A[i]\). This requires replacing \(\{k,t\}{:}[tt,tt]^{*}\) by
\((\{B \wedge I\} ;\ \{k,t\}{:}[tt,tt]^{*})^{*}\). Formally, we show

{kyt}:[ttyU]* 

—q-t  (skip  ;  Lemma  2.1 

({^  £  ^  A  /}  ;  {kyt}:[ityUY)* .  Lemma  2.2 

using  equivalence  under  finite  stuttering  and  congruence.  Congruence  then  also 
implies  72$  =  C3  Dq-x  C4  which  in  turn  yields  the  desired  refinement  formula 

[Pab,  {Pab,Qab,Qs}  U  =  V  |  i;  €  N}] 

C2  y  C4  Lemma  5.4(7^4,725) 

[Qab  A  QsyPr€ds{ii^)] 

by  weakening  and  the  previous  refinement  7^4. 

Refining  C4  into  C5 

We will now replace the finite loop on the left by a while loop. However, for
the while loop to correctly store the sum over A in t upon termination, the
states in which the right subprogram can update A must be restricted. This
refinement step modifies the two parallel subprograms in C4 simultaneously.

Left subprogram: We want to use rule WHILE-INTRO to replace the finite loop
by a while loop. The rule has three premises.


1. First, we must prove that I is indeed a loop invariant. The loop body is
refined at the same time. More precisely, we show

\[
\begin{array}{ll}
 & [k \le n \wedge I,\ \{k \le n,\ I\}] \\
 & \{k,t\}{:}[tt,tt]^{*} \\
 & \succ \qquad \text{Lemma 3.4} \\
 & \{k,t\}{:}[tt,tt] ;\ \{k,t\}{:}[tt,tt] \\
 & \succ \qquad \text{ATOM, SEQ} \\
 & t := t + A[k] ;\ k := k + 1 \\
 & [I,\ \{P_{ab}, Q_{ab}\}].
\end{array}
\]

2. Next, we argue that \(t = \Sigma A\) holds upon termination of the loop, that is,
\(k > n \wedge I \Rightarrow t = \Sigma A\).

3. Moreover, we need to find an arithmetic expression \(m_1\) that allows us to
prove termination of the while loop. Let

\[
m_1 \equiv \max(n + 1 - k,\ 0).
\]

We check each of the three conditions on \(m_1\). Clearly, \(m_1\) always is
nonnegative and \(m_1 = 0\) implies violation of the loop condition, that is, \(m_1 \ge 0\)
and \(m_1 = 0 \Rightarrow k > n\). Moreover, the loop body decreases \(m_1\), because
\(k := k + 1\) does and \(t := t + A[k]\) leaves \(m_1\) unchanged. Formally,

{inv*mi  ;Ami  ;  mt^*mi)+ 

Dj-t  inv  mi  ;  .4mi  clef:  c+,  c* 

Dq-t  t:=i  A  A[k]  ;  k:=:k 1.  Lemma  2.2. 
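The three conditions on the variant can be checked mechanically on a concrete run. The Python sketch below is only an illustration under stated assumptions: it takes the variant to be max(n + 1 - k, 0), uses a 1-based counter k over a 0-based Python list, and the helper names `m1` and `check_variant` are hypothetical. It executes the loop body as the two atomic steps of the derivation and asserts after each step what the text claims.

```python
def m1(n, k):
    # Variant for the summation loop: nonnegative, and 0 implies k > n.
    return max(n + 1 - k, 0)


def check_variant(A):
    """Run the summation loop and record the variant after each iteration."""
    n = len(A)
    k, t = 1, 0
    history = []
    while k <= n:
        before = m1(n, k)
        t = t + A[k - 1]            # t := t + A[k] (A is 1-based in the text)
        assert m1(n, k) == before   # this step leaves the variant unchanged
        k = k + 1                   # k := k + 1
        assert 0 <= m1(n, k) < before  # this step strictly decreases it
        history.append(m1(n, k))
    assert m1(n, k) == 0 and k > n  # variant 0 coincides with loop exit
    return t, k, history
```

On a three-element array the variant takes the values 2, 1, 0 after the successive iterations, and the loop exits with k = n + 1 and t equal to the full sum.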


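Thus,

\[
\begin{array}{ll}
\mathcal{R}_6 \equiv & [I,\ \Gamma \cup \Gamma_{m_1}] \\
 & (\{k \le n \wedge I\} ;\ \{k,t\}{:}[tt,tt]^{*})^{*} ;\ \{t = \Sigma A\} ;\ s := t \\
 & \succ \qquad \text{WHILE-INTRO}(1,2,3),\ \text{SEQ} \\
 & \mathbf{while}\ k \le n\ \mathbf{do}\ t := t + A[k] ;\ k := k + 1\ \mathbf{od} ;\ s := t \\
 & [Q_s,\ \{P_{ab}, Q_{ab}\}]
\end{array}
\]

where \(\Gamma = \{k \le n,\ I,\ Q_s,\ t = \Sigma A\}\) and \(\Gamma_{m_1} = \{m_1 < n \mid n \in \mathbb{N}\}\).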

Right subprogram: The above refinement is subject to the constraints \(\Gamma\) and
\(\Gamma_{m_1}\). However, in its current form the right subprogram does not meet these
constraints. In particular, it does not preserve the invariant I. The transferred
money may be counted twice: once on account a and again on b. The solution
is to restrict the transitions of the interfering component such that it cannot
disturb the computation of the other. This is achieved by postulating that the
transition which transfers $20 from account a to account b preserves the value
of \(\sum_{i=1}^{k-1} A[i]\) and thus the predicate \(t = \sum_{i=1}^{k-1} A[i]\) for all values of k. Let

\[
\begin{array}{lcl}
Q & \equiv & A[a] = \overleftarrow{A[a]} - 20 \wedge A[b] = \overleftarrow{A[b]} + 20 \\
R & \equiv & \sum_{i=1}^{k-1} A[i] = \overleftarrow{\sum_{i=1}^{k-1} A[i]}.
\end{array}
\]


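Then,

\[
\begin{array}{ll}
\mathcal{R}_7 \equiv & [P_{ab},\ \{P_{ab}, Q_{ab}\}] \\
 & A[a], A[b] := A[a]-20,\ A[b]+20 \\
 & \succ \qquad \text{ATOM} \\
 & \{A[a],A[b]\}{:}[tt,\ Q \wedge R] \\
 & [Q_{ab},\ \Lambda]
\end{array}
\]

where \(\Lambda = \{Q_s,\ I,\ t = \Sigma A\} \cup \mathit{Preds}(\{k,t,s\})\). We refine this further by
restricting the transfer to states in which either k < a and k < b, or k > a and k > b.
Let \(P \equiv (k < a \wedge k < b) \vee (k > a \wedge k > b)\). Then,

\[
\begin{array}{ll}
\mathcal{R}_8 \equiv & [P_{ab},\ \{P_{ab}, Q_{ab}\}] \\
 & \{A[a],A[b]\}{:}[tt,\ Q \wedge R] \;\succ\; \{A[a],A[b]\}{:}[P,Q] \qquad \text{ATOM} \\
 & [Q_{ab},\ \Lambda].
\end{array}
\]

The above two refinements imply with transitivity

\[
\begin{array}{ll}
\mathcal{R}_9 \equiv & [P_{ab},\ \{P_{ab}, Q_{ab}\}] \\
 & A[a], A[b] := A[a]-20,\ A[b]+20 \\
 & \succ \qquad \text{Lemma 5.3}(\mathcal{R}_7,\mathcal{R}_8) \\
 & \{A[a],A[b]\}{:}[P,Q] \\
 & [Q_{ab},\ \Lambda].
\end{array}
\]

This concludes the refinement of the right parallel component.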
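The effect of the restriction P can be observed by brute force. The following Python sketch is an illustrative model rather than part of the derivation: it treats `t := t + A[k]`, `k := k + 1` and `s := t` as separate atomic steps of the left subprogram, tries the atomic transfer at every possible position between them, and collects the resulting values of s. The function name `final_sums` and the encoding of the schedule are assumptions made for this example. Positions at which the guard is false are skipped in the guarded case, which models the transfer simply not being enabled there.

```python
def final_sums(A, a, b, guarded):
    """Final values of s over all placements of the atomic transfer.

    A is a Python list; a, b and the loop counter k are 1-based as in the text.
    """
    n = len(A)
    results = set()
    # The left program has 2n + 1 atomic steps; the transfer may occur
    # before any of them or after the last one.
    for slot in range(2 * n + 2):
        arr, k, t, s = list(A), 1, 0, None
        fired = False
        for step in range(2 * n + 2):
            if step == slot:
                p = (k < a and k < b) or (k > a and k > b)
                if p or not guarded:
                    arr[a - 1] -= 20
                    arr[b - 1] += 20
                    fired = True
            if step < 2 * n:
                if step % 2 == 0:
                    t += arr[k - 1]   # t := t + A[k]
                else:
                    k += 1            # k := k + 1
            elif step == 2 * n:
                s = t                 # s := t
        if fired:
            results.add(s)
    return results
```

With three accounts holding 100, 50 and 30 and a transfer from account 1 to account 3, the unrestricted transfer can produce 200 instead of 180, because the $20 is counted on both accounts. Under the guard only 180 remains. Note that the strict inequalities in P matter in this model: between `t := t + A[k]` and `k := k + 1` the entry A[k] has already been added although k has not yet moved on.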

Note that the refined right subprogram now meets the constraints placed on
it by the left subprogram. That is, we have \(\Gamma \cup \Gamma_{m_1} \subseteq \Lambda\). The refinements of
the left and right subprograms can now be combined into a refinement of their
parallel composition. Refinement between the overall programs C4 and C5 then
is obtained with application of NEW.

\[
\begin{array}{ll}
 & [P_{ab},\ \{P_{ab}, Q_{ab}, Q_s\} \cup \{\Sigma A = v \mid v \in \mathbb{N}\}] \\
 & C_4 \succ C_5 \qquad \text{PAR}(\mathcal{R}_6,\mathcal{R}_9),\ \text{NEW}(k,t) \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)]
\end{array}
\]

As before, the NEW rule has a substantial simplifying effect on the assumptions
\(\Gamma \cup \Gamma_{m_1}\) from \(\mathcal{R}_6\). Since k and t are local, the preservation of \(k \le n\), I, and \(\Gamma_{m_1}\)
is trivially ensured. More precisely, \(k \le n\) is replaced by \(\{v \le n \mid v \in \mathbb{N}\}\) which
is trivially preserved because every predicate consists of constants only. Similar
arguments apply to I and all predicates in \(\Gamma_{m_1}\). Moreover, the requirement
that \(t = \Sigma A\) is preserved gives way to the requirement that the sum over A,
\(\Sigma A\), is unchanged. More precisely, the predicate \(t = \Sigma A\) in \(\Gamma\) in \(\mathcal{R}_6\) is replaced
by the set of predicates \(\{\Sigma A = v \mid v \in \mathbb{N}\}\).

Refining C5 into C6

This refinement step will replace \(\{A[a],A[b]\}{:}[P,Q]\) by

\[
\mathbf{await}\ P\ \mathbf{then}\ A[a], A[b] := A[a] - 20,\ A[b] + 20.
\]

We check the premises of rule AWAIT-INTRO.

1. First, an assumption-commitment formula for the subprogram to be
refined and its parallel environment is needed. We have

\[
\begin{array}{ll}
 & [P_{ab},\ \{P_{ab}, Q_{ab}, Q_s\} \cup \{\Sigma A = v \mid v \in \mathbb{N}\}] \\
 & D \;\|\; \{A[a],A[b]\}{:}[P,Q] \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)]
\end{array}
\]

where D is

\[
D \equiv \mathbf{while}\ k \le n\ \mathbf{do}\ t := t + A[k] ;\ k := k + 1\ \mathbf{od} ;\ s := t.
\]

2. Second, we need to find an arithmetic expression that allows us to show
that the parallel program D will eventually make the await condition
true. Let

\[
m_2 \equiv \mathit{cond}(k > \max(a,b),\ 0,\ \max(a,b) - k + 1).
\]

Clearly, \(m_2\) is always nonnegative and \(m_2 = 0\) implies the await condition,
that is, \(m_2 \ge 0\) and \(m_2 = 0 \Rightarrow (k < a \wedge k < b) \vee (k > a \wedge k > b)\). Moreover,
the program running in parallel must be shown to either decrease \(m_2\)
infinitely often or at least until \(m_2\) is 0. We show this as follows. The
parallel program D can be split into \(D_1 \vee D_2\) where

\[
D_1 \equiv (\{k \le n\} ;\ t := t + A[k] ;\ k := k + 1)^{\omega}
\]

and

\[
D_2 \equiv (\{k \le n\} ;\ t := t + A[k] ;\ k := k + 1)^{*} ;\ \{k > n\} ;\ s := t
\]

using the definition of while loops and sequential composition. Since
\(k := k + 1\) decreases \(m_2\) while all other atomic statements in \(D_1\) leave \(m_2\)
unchanged, \(D_1\) decreases \(m_2\) infinitely often. Similarly, \(D_2\) decreases \(m_2\)
until it is zero. The formal proof uses characteristic formulas of atomic
statements to determine trace inclusion, and the congruence property.
More precisely,

Di  Cjx  {inv*m2  ;  Am^  5  if^v"m2Y  Lemma  2.2 

D2  Cjt  {mv'm2  ;  >lm2  5  inv* 7x12)*  ;  {m2  =  0}  ;  int;* m2*  Lemma  2.2 

Note  that  k  >  n  implies  k  >  max{a,b)  and  thus  m2  =  0.  The  second 
condition  of  rule  AWAIT-INTRO  follows. 
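The properties claimed for the second variant can be checked for concrete values of a, b and n. The Python sketch below assumes the reading cond(k > max(a,b), 0, max(a,b) - k + 1) for the variant. The helper names `m2`, `await_condition` and `check_m2` are hypothetical and serve only this illustration.

```python
def m2(k, a, b):
    # cond(k > max(a, b), 0, max(a, b) - k + 1)
    top = max(a, b)
    return 0 if k > top else top - k + 1


def await_condition(k, a, b):
    # P: the loop counter lies strictly before or strictly after both accounts.
    return (k < a and k < b) or (k > a and k > b)


def check_m2(n, a, b):
    """Values of m2 as k runs from 1 to n + 1, with the claimed properties asserted."""
    values = []
    for k in range(1, n + 2):
        v = m2(k, a, b)
        assert v >= 0                          # always nonnegative
        if v == 0:
            assert await_condition(k, a, b)    # m2 = 0 implies P
        values.append(v)
    for prev, cur in zip(values, values[1:]):
        assert cur < prev or prev == 0         # k := k + 1 decreases m2 until 0
    assert values[-1] == 0                     # k > n implies m2 = 0
    return values
```

For n = 4, a = 2 and b = 3 the variant runs through 3, 2, 1, 0, 0, so the await condition is guaranteed to hold no later than when the loop counter passes max(a, b).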


Thus, 

72^10  =  \Pab’i  {Pab'i^  ^  ^-^)Qab^Q s} 

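\[
\begin{array}{ll}
 & D \;\|\; \{A[a],A[b]\}{:}[P,Q] \\
 & \succ \qquad \text{AWAIT-INTRO}(1,2) \\
 & D \;\|\; \mathbf{await}\ P\ \mathbf{then}\ A[a], A[b] := A[a]-20,\ A[b]+20 \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)]
\end{array}
\]

where \(\Gamma_{m_2} = \{m_2 < v \mid v \in \mathbb{N}\}\). The declaration of k and t needs to be added
to \(\mathcal{R}_{10}\) to obtain the desired overall refinement.

\[
\begin{array}{ll}
 & [P_{ab},\ \{P_{ab}, Q_{ab}, Q_s\} \cup \{\Sigma A = v \mid v \in \mathbb{N}\}] \\
 & C_5 \succ C_6 \qquad \text{NEW}(\mathcal{R}_{10}, k, t) \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)].
\end{array}
\]

As in the previous refinement, the application of the NEW rule simplifies the
assumptions greatly. Since k is local, it cannot be changed by the environment
and thus is trivially preserved.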

Putting  it  all  together 
By  transitivity  we  get 

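\[
\begin{array}{ll}
 & [P_{ab},\ \{P_{ab}, Q_{ab}, Q_s\} \cup \{\Sigma A = v \mid v \in \mathbb{N}\}] \\
 & C_1 \succ C_6 \\
 & [Q_{ab} \wedge Q_s,\ \mathit{Preds}(\emptyset)].
\end{array}
\]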

With  weakening  and  Lemma  5.5  this  implies  the  desired  result 
{Pad}  ;  C\  5ft  {Pab}  »  Cq 


Lemma  5.3 
and


\[
\{P_{ab}\}\ C_6\ \{Q_{ab} \wedge Q_s\},
\]

that is, every execution of C6 starting in a state satisfying Pab will also be an
execution of C1. Moreover, if that execution terminates it does so in a state
satisfying \(Q_{ab} \wedge Q_s\).
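For readers who want to run the final program, here is a minimal Python sketch of C6 using two threads. It is one possible encoding under explicit assumptions, not the thesis's implementation: atomic statements are modelled by holding a shared `threading.Condition`, the await statement is modelled with `wait_for`, and the names `run_c6`, `summation` and `transfer` are made up for the example. Accounts a and b are 1-based as in the text.

```python
import threading


def run_c6(accounts, a, b):
    """Run the summation loop and the guarded transfer in parallel."""
    A = list(accounts)
    n = len(A)
    cond = threading.Condition()
    local = {"k": 1, "t": 0}   # the local variables k and t
    result = {}

    def summation():
        while True:
            with cond:                       # loop test k <= n
                if local["k"] > n:
                    break
            with cond:                       # t := t + A[k]
                local["t"] += A[local["k"] - 1]
            with cond:                       # k := k + 1
                local["k"] += 1
                cond.notify_all()            # the guard may have become true
        with cond:                           # s := t
            result["s"] = local["t"]

    def guard():
        k = local["k"]
        return (k < a and k < b) or (k > a and k > b)

    def transfer():
        with cond:                           # await P then ... end
            cond.wait_for(guard)
            A[a - 1], A[b - 1] = A[a - 1] - 20, A[b - 1] + 20

    threads = [threading.Thread(target=summation),
               threading.Thread(target=transfer)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return result["s"], A
```

Whatever the scheduler does, `run_c6([100, 50, 30], 1, 3)` returns the sum 180 together with the updated accounts, and the transfer thread cannot block forever because k eventually exceeds max(a, b).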

Discussion

1. Local variables are not subject to environment interference. Declaring a
variable as local thus typically simplifies the environment assumptions.
Note, for instance, that the loop invariant I is not mentioned in the
assumptions of the overall refinement. This is because it mentions the local
variables k and t only and thus is trivially preserved.

2. Note that the assumptions only ask for Pab, Qab, Qs and the value of \(\Sigma A\) to
be preserved. This means that C6 could be put in a parallel context which,
for instance, atomically swaps two entries in A or atomically performs
another transfer.

3. We have shown that every execution of C6 starting with Pab also is an
execution of C1. A little thought shows that the converse also is true,
that is, that C1 and C6 have identical executions. Note, however, that in
its current form our calculus does not allow us to derive this.


6.2 Example: The Floyd-Warshall algorithm

The Floyd-Warshall algorithm is a dynamic programming formulation to solve
the all-pairs shortest-paths problem on a directed, weighted graph [CLR90]. Let
G be a graph G = (V, E) with vertices V = {1, ..., n} and edges \(E \subseteq V \times V\).
Also, let W be an n x n adjacency matrix representing the edge weights of G,
that is,

\[
W[i,j] =
\begin{cases}
0, & \text{if } i = j \\
\text{the weight of the edge } (i,j), & \text{if } i \ne j \text{ and } (i,j) \in E \\
\infty, & \text{if } i \ne j \text{ and } (i,j) \notin E.
\end{cases}
\]

Edges may have negative weights, but we shall assume that there are no
negative-weight cycles in G. Moreover, let \(\delta(i,j)\) be the length (weight) of the
shortest path in G from vertex i to vertex j if such a path exists. Otherwise, let
\(\delta(i,j) = \infty\). We will assume that W[i,j] and thus also \(\delta(i,j)\) are constants for
all i and j, that is, the adjacency matrix does not change.

We are to design an algorithm that computes the length of the shortest paths
between any two nodes in G. More precisely, we want to compute the matrix
D such that \(Q_D\) holds where \(Q_D \equiv \forall 1 \le i,j \le n.\ D[i,j] = \delta(i,j)\). The initial
specification C1 thus is

\[
C_1 \equiv D{:}[tt, Q_D].
\]

Compared to most expositions of this algorithm in the literature, in [CLR90]
for instance, we will derive an implementation that exhibits more parallelism.
The entire refinement is summarized in Figure 6.2 where new d = W in C
abbreviates new d[1,1] = W[1,1], ..., d[n,n] = W[n,n] in C.

Refining C1 into C2

The first refinement step introduces a local matrix d together with a finite loop
that updates it. d has the same type as D. According to the initial specification
C1, D can only be updated once; d, however, can be updated a finite number
of times until it holds the shortest distances, that is, until

\[
Q_d \equiv \forall 1 \le i,j \le n.\ d[i,j] = \delta(i,j).
\]

An assignment D := d achieves the desired postcondition.

Formally, this step is justified as follows. First, we show that \(d{:}[tt,tt]^{*} ;\ \{Q_d\}\)
is subsumed by finite stuttering if changes to d are ignored.

nx  = 

skip*^  ;skip  d:[ti^tt]*  ;  {Qd}  atom, star,  seq 
[Qd,  Preds{%)\ 

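Next, the behaviour of D := d in initial states with \(Q_d\) is shown to be subsumed
by \(D{:}[tt, Q_D]\). That is,

\[
\begin{array}{ll}
\mathcal{R}_2 \equiv & [Q_d,\ \{Q_d, Q_D\}] \\
 & D{:}[tt, Q_D] \\
 & \succ \qquad \text{ATOM} \\
 & D := d \\
 & [Q_D,\ \mathit{Preds}(\emptyset)].
\end{array}
\]

The sequential composition of these two refinements yields

\[
\begin{array}{ll}
\mathcal{R}_3 \equiv & [tt,\ \{Q_d, Q_D\}] \\
 & D{:}[tt, Q_D] \\
 & \succ \qquad \text{(Lemma 2.1), Lemma 5.4} \\
 & \mathbf{skip}^{*} ;\ \mathbf{skip} ;\ D{:}[tt, Q_D] \\
 & \succ_{\{d\}} \qquad \text{SEQ}(\mathcal{R}_1,\mathcal{R}_2) \\
 & d{:}[tt,tt]^{*} ;\ \{Q_d\} ;\ D := d \\
 & [Q_D,\ \mathit{Preds}(\emptyset)].
\end{array}
\]

We now declare d to conclude refinement between C1 and C2.

\[
[tt, \{Q_D\}] \quad C_1 \succ C_2 \quad [Q_D,\ \mathit{Preds}(\emptyset)] \qquad \text{NEW-INTRO}(\mathcal{R}_3, d)
\]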
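\[
\begin{array}{lcl}
C_1 & \equiv & D{:}[tt, Q_D] \\
 & & \text{where } Q_D \equiv \forall 1 \le i,j \le n.\ D[i,j] = \delta(i,j) \\[6pt]
C_2 & \equiv & \mathbf{new}\ d = W\ \mathbf{in} \\
 & & \quad d{:}[tt,tt]^{*} ;\ \{Q_d\} ;\ D := d \\
 & & \text{where } Q_d \equiv \forall 1 \le i,j \le n.\ d[i,j] = \delta(i,j) \\[6pt]
C_3 & \equiv & \mathbf{new}\ d = W\ \mathbf{in} \\
 & & \quad \mathbf{for}\ k = 1\ \mathbf{to}\ n\ \mathbf{do}\ d{:}[tt,tt]^{*} ;\ \{Q_d^{k}\}\ \mathbf{od} ;\ D := d \\
 & & \text{where } Q_d^{k} \equiv \forall 1 \le i,j \le n.\ d[i,j] = \delta(i,j,k) \\[6pt]
C_4 & \equiv & \mathbf{new}\ d = W\ \mathbf{in} \\
 & & \quad \mathbf{for}\ k = 1\ \mathbf{to}\ n\ \mathbf{do}\ \big[\, \|_{i,j=1}^{n}\ d[i,j]{:}[tt, Q^{i,j,k}] \,\big]\ \mathbf{od} ;\ D := d \\[6pt]
C_5 & \equiv & \mathbf{new}\ d = W\ \mathbf{in} \\
 & & \quad \mathbf{for}\ k = 1\ \mathbf{to}\ n\ \mathbf{do}\ \big[\, \|_{i,j=1}^{n}\ d[i,j] := \min\{d[i,j],\ d[i,k] + d[k,j]\} \,\big]\ \mathbf{od} ;\ D := d
\end{array}
\]

Figure 6.2: Derivation of an implementation of the Floyd-Warshall algorithm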
Since d is now local and \(\delta(i,j)\) is constant, \(Q_d\) is automatically preserved. More
precisely, the application of NEW-INTRO replaces the requirement to preserve
\(Q_d\) by the requirement to never change \(\delta(i,j)\) for all \(1 \le i,j \le n\). Since \(\delta(i,j)\)
is constant, this is vacuously true.

Refining C2 into C3

The finite loop \(d{:}[tt,tt]^{*} ;\ \{Q_d\}\) is refined into a for loop with loop counter k
ranging from 1 to n. The kth iteration is assumed to establish

\[
Q_d^{k} \equiv \forall 1 \le i,j \le n.\ d[i,j] = \delta(i,j,k)
\]

where \(\delta(i,j,k)\) is the length of the shortest path from i to j whose intermediate
vertices are all drawn from {1, ..., k}. If there is no such path, let \(\delta(i,j,k) = \infty\).
Note that \(\delta(i,j) = \delta(i,j,n)\). Thus, upon termination we have \(Q_d^{n}\), which is
equivalent to \(Q_d\). Also note that \(Q_d^{0}\) is equivalent to d = W.

Formally, we want to use FOR-INTRO. We first check each of its premises.

1. First, note that k is never assigned to in C2 or C3.

2. Second, each of the iterations can be refined as follows with \(Q_d^{k-1}\) serving
as the loop invariant.


[Q^^{Q4] 

d:[ttyttY 

y 

(Lemma  2.1),  Lemma  5.4 

d:[tt,tt]*  ;  skip 

ATOM,  SEQ 

d:[«,«]*;{Q$} 

[Qdi  PredsiH)] 

for  all  0  <  <  n  —  1. 

3. Since there are no negative-weight cycles in the graph, each vertex in
the graph can occur at most once in the shortest path between any two
vertices. Thus, the postcondition of the last iteration implies the desired
overall postcondition, that is, \(Q_d^{n} \Rightarrow Q_d\).

Thus, 

[Q3>{Q5lo<fc<”}] 

(Lemma  2.1),  Lemma  5.4 

{d:[tt,ttrr  ^AQd} 

y 

for  Ar  =  1  to  n  do 
[Qd,  Prec/s(0)] . 


FOR-INTRO(l,2,3) 
We compose this loop with the trailing assignment D := d sequentially and
declare d. Formally,

\[
[tt, \{Q_D\}] \quad C_2 \succ C_3 \quad [Q_D,\ \mathit{Preds}(\emptyset)] \qquad \text{ATOM, SEQ, NEW}(d)
\]

Note that the initialization d = W establishes \(Q_d^{0}\) and that the locality of d
guarantees the preservation of the predicates \(Q_d^{k}\) for all k.


Refining C3 into C4

In the kth iteration the statement \(d{:}[tt,tt]^{*}\) allows C3 to update d an arbitrary
but finite number of times before \(Q_d^{k}\) is established. This computation of \(Q_d^{k}\) is
now refined into parallel processes using PAR-INTRO. Process (i,j) computes

\[
Q^{i,j,k} \equiv d[i,j] = \delta(i,j,k).
\]

Note that \(\forall 1 \le i,j \le n.\ Q^{i,j,k}\) implies \(Q_d^{k}\). However, before we can use rule
PAR-INTRO for this refinement, we must ensure that each parallel process (i,j)
preserves the postconditions \(Q^{r,s,k}\) of the other processes (r,s). Thus, the
computation of each of the future parallel processes needs to be restricted
appropriately. Formally,


114=  d:[tt,ttY 

‘Dq'i  ,  Vi,  j.pre  •  Lemma  2.2 


We  check  the  four  premises  of  rule  PAR- INTRO. 

1.  First,  since  d:[tt,'^i,  j.pre  is  atomic,  its  robustness  follows  directly 

with  Proposition  2.1.  Moreover,  it  also  preserves  using  Proposition  4.1. 

2.  Second,  d:[ttyi^  j.pre  is  refined  into  d[ij  j]:[tt,Qy'^].  Formally, 

d:[tt,'ii,  j.pre  (5'/’*] 

ATOM 

d[i,j]:[tt,Qy'''] 
for  all  i  and  j  where 

A  =  I  1  <  <  n}. 

Note  that  i  and  j  occur  bound  in  the  refined  program  but  free  in  the 
refining  program.  Also,  note  that  is  preserved  for  all  i  and  j. 


3.  The  assumptions  of  process  (i,j)  are  contained  in  the  guarantees  of  its 
environment,  that  is,  of  all  processes  (r,  5)  with  r  f  or  s  7^  j.  Formally, 


C  A,  and  thus, 


c  n 


A  - 

(r, «)=(!, l),(r,«)?£(i,j)^ 


A. 
4. The postcondition \(Q_d^{k}\) follows from the conjunction of the postconditions
\(Q^{i,j,k}\) of each of the processes, that is,

\[
(\forall 1 \le i,j \le n.\ Q^{i,j,k}) \Rightarrow Q_d^{k}.
\]

Thus,  with  PAR-INTRO, 

1 1  <  <  »}] 

d:ltt,Vi,j.pre  Q'/’'’]* ;  {Q$} 

>-  PAR-INTRO(l,2,3.4) 

[QiPreds(ii)] 

which  implies 

[«,{Qy^’*|l<i,j<n}] 

d:ltt,tt]* ;  {Qa) 

y  (724),  Lemma  3.4 

d:ltt,Vi,j.pre  Q’/'*]*  ;  {Qj} 

y 

[QS,Preds(0)] 

using  7Z4  and  weakening. 

To obtain refinement between C3 and C4, we build up the remaining context
using the rules indicated below.

\[
[tt, \{Q_D\}] \quad C_3 \succ C_4 \quad [Q_D,\ \mathit{Preds}(\emptyset)] \qquad \text{FOR, SEQ, NEW}(d)
\]


Refining C4 into C5

Each of the parallel processes (i,j) is refined now. We use two transformations
to turn \(d[i,j]{:}[tt, Q^{i,j,k}]\) into an executable program. The Floyd-Warshall
algorithm is based on a property of boolean matrices and relies on the absence of
negative-weight cycles. For more details, see [War62, CLR90]. In our setting,
this property is expressed as

\[
\delta(i,j,k) = \min\{\delta(i,j,k-1),\ \delta(i,k,k-1) + \delta(k,j,k-1)\}.
\]

The above equation allows the computation of the shortest distance between
i and j via intermediate nodes 1 through k in terms of the shortest distances
between any two nodes in the graph via intermediate nodes 1 through k - 1.
More precisely, the update \(d[i,j]{:}[tt, Q^{i,j,k}]\) in C4 can be replaced by

\[
d[i,j] := \min\{\delta(i,j,k-1),\ \delta(i,k,k-1) + \delta(k,j,k-1)\}.
\]
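The recurrence can be tried out directly. The Python sketch below is a plain transcription of it under stated assumptions: vertices are 1-based as in the text, the base case delta(i, j, 0) is taken to be W[i, j] (no intermediate vertices allowed), and the names `make_delta` and `delta` are chosen for this illustration only. It is meant to show what the equation computes, not to be an efficient algorithm.

```python
import math
from functools import lru_cache


def make_delta(W):
    """Return delta(i, j, k) for the adjacency matrix W (list of lists)."""

    @lru_cache(maxsize=None)
    def delta(i, j, k):
        if k == 0:
            # No intermediate vertices: only the direct edge counts.
            return W[i - 1][j - 1]
        # Either avoid vertex k, or pass through it exactly once.
        return min(delta(i, j, k - 1),
                   delta(i, k, k - 1) + delta(k, j, k - 1))

    return delta


# Example graph with edges 1 -> 2 (weight 4), 2 -> 3 (weight 1), 3 -> 1 (weight 2).
INF = math.inf
W_EXAMPLE = [[0, 4, INF],
             [INF, 0, 1],
             [2, INF, 0]]
```

With `delta = make_delta(W_EXAMPLE)`, the path from 1 to 3 only becomes available once vertex 2 may be used as an intermediate vertex, so `delta(1, 3, 1)` is infinite while `delta(1, 3, 2)` is finite.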
Formally, we have

7^5  =  dliJ]:[tt,Qy’^] 

=Tt  d[i,j]:=:7nin{S(i,j,k  -  1)  -j-S(k,j,k  -  1)}. 


The second transformation uses the loop invariant. At the beginning of the kth
iteration, \(Q_d^{k-1}\) holds. Consequently, d[i,j] contains the length of the shortest
path from i to j via intermediate nodes 1 through k - 1, that is, \(\delta(i,j,k-1)\).
Thus, the current values of d[i,j], d[i,k] and d[k,j] can be used in the
computation of the next value of d[i,j]. However, since d[i,k] and d[k,j] are
also assigned to by processes (i,k) and (k,j) respectively, the non-interference
condition complicates this refinement step.

Formally, 


where 

Pij 

A.-,i 


[■Pij)  Tij] 

d[i,j]  :=tr).in{S{i,j,k  -  k,k-l)  +S{k,j,  k  -  1)} 


ATOM 


d[i,  j]  :=min{d[i,  j],  d[i,  k]  +  d[k,  i]} 

[Q/’  1  Ajj] 

=  I  #  {i,j)]  u  \i  =  kVj  =  k}. 


The  guarantees  Aj  j  need  a  short  explanation.  The  preservation  of  and 

for  (r,  s)  (hj)  follows  readily.  However,  the  additional  preservation  of 
and  if  i  =  k  01  j  =  k  is  surprising.  To  see  why  these  predicates  are 
preserved, note that whenever i = k or j = k, we have

\[
\min\{d[i,j],\ d[i,k] + d[k,j]\} = d[i,j],
\]

because d[i,i] = d[j,j] = 0. Informally, the shortest path from i to j via 1, ..., k
is as long as the shortest path from i to j via 1, ..., k - 1, if the intermediate
vertex k is identical to either the beginning or the end of the path. Consequently,
if i = k or j = k, the assignment

\[
d[i,j] := \min\{d[i,j],\ d[i,k] + d[k,j]\}
\]

does not change the value of d[i,j] and thus additionally preserves these predicates.

Thus,

d[t,Mu,QT'] 

>-  =^^(^5),  Lemma  5.4 

d[i,j]:=min{S{i,j,k-  l),S{i,k,k-  1) 6{k,j,k  -  1)} 

y  ATOM 

d[i,  j]  :=min{d[i,  j],  4*,  k]  +  4^,  j]} 


Before we can put all \(n^2\) processes in parallel, the non-interference
requirement needs to be checked. We have \(\Gamma_{i,j} \subseteq \Lambda_{r,s}\) for all \((r,s) \ne (i,j)\) which
implies


r.-,i 


!=  I  l(r,,)=(l,l).(r, ,)?<(.•, i) 


A 


r,a 


as  required.  Thus, 

PAR-N(W4,,) 

\\if=i,Ah  j]  :=Tnin{d[i,  j],  d[i,  Ar]  +  4Ar,  j]} 

[Qd,  Preds(ll>)]. 

Note  that  the  precondition  VI  <  i,j  <  n.Pi  j  can  be  strengthened  to 
The  overall  refinement  follows. 

\[
[tt, \{Q_D\}] \quad C_4 \succ C_5 \quad [Q_D,\ \mathit{Preds}(\emptyset)] \qquad \text{FOR, SEQ, NEW}(d)
\]


Putting  it  all  together 
By  transitivity  we  get 

\[
[tt, \{Q_D\}] \quad C_1 \succ C_5 \quad [Q_D,\ \mathit{Preds}(\emptyset)] \qquad \text{Lemma 5.3}
\]

With  weakening  and  Lemma  5.5  this  implies  the  desired  result 
{tt}  ;  Cl  Dst  {it}  ;  C5  and  {it}  C5  {Qo}- 
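The claim that the updates within one iteration of k do not interfere can be explored with a small simulation. The Python sketch below is an approximation under stated assumptions: it models the parallel composition of the atomic assignments by executing them sequentially in a randomly shuffled order, which covers interleavings of atomic steps but is not a proof. The names `floyd_warshall_c5` and `W_SAMPLE` are hypothetical, and indices are 0-based.

```python
import math
import random


def floyd_warshall_c5(W, seed=0):
    """Shortest distances, applying each iteration's updates in arbitrary order."""
    rng = random.Random(seed)
    n = len(W)
    d = [row[:] for row in W]            # new d = W
    for k in range(n):                   # for k = 1 to n
        pairs = [(i, j) for i in range(n) for j in range(n)]
        rng.shuffle(pairs)               # some interleaving of the n*n processes
        for i, j in pairs:
            # Atomic assignment of process (i, j).
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d                             # D := d


# Edges 1 -> 2 (weight 4), 2 -> 3 (weight 1), 3 -> 1 (weight 2), 0-based below.
INF_W = math.inf
W_SAMPLE = [[0, 4, INF_W],
            [INF_W, 0, 1],
            [2, INF_W, 0]]
```

Running the function with different seeds gives the same distance matrix each time, which matches the observation in the text that row k and column k are left unchanged during iteration k.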

Discussion

1. Typical implementations of the Warshall algorithm resort to a for loop
rather than a parallel composition to implement \(d{:}[tt,tt]^{*} ;\ \{Q_d^{k}\}\). Program
C5 thus demonstrates that the sequentiality is not necessary. Note that
the sequential implementation could easily be derived by refining C5
using Lemma 2.2.8. Moreover, it could also be derived directly using our
calculus.

2. As in the previous example, the locality of the matrix variable d
automatically shielded the derived programs from interference destroying the
validity of predicates involving d. Since most of the refinement steps involved
local variables (only the final assignment D := d uses a global variable), the
resulting program poses few assumptions on its parallel environment. In
fact, program C5 works correctly in all parallel environments as long as
the value of D[i,j] is not changed if it is \(\delta(i,j)\) for all i and j.

3. Using ideas from [vdS86], program C5 could be refined further into a
distributed implementation. The use of refinement to turn a shared-variable
implementation into a distributed message-passing implementation will be
illustrated later in Section 7.

6.3 Example: Maximum search

We are to develop a program that finds the maximal entry in an array of integers
A[1..n] and stores it in variable x. Formally, upon termination we should have
\(x \ge A[i]\) for all \(1 \le i \le n\) and \(x = A[j]\) for some \(1 \le j \le n\), abbreviated as
max(x). As we will see, the nature of the problem allows for a highly parallel
solution which gives rise to a number of more sequential variants.

We start with a program C1 that allows x to be changed arbitrarily an
arbitrary but finite number of times before terminating in a state in which
max(x), that is,

\[
x{:}[tt,tt]^{*} ;\ \{max(x)\}.
\]

The first derivation of an implementation is summarized in Figure 6.3.

Refining C1 into C2

The program C2 breaks the problem into two sequential parts. The first part
updates a local boolean array m such that a certain relationship between an
entry in m and the corresponding entry in A holds upon termination. More
precisely, we want m[i] to be true if and only if A[i] contains the maximum of
A. The second part will later use this relationship to find the maximum of A
and store it in x. This refinement step is concerned with the introduction of
the first part. A finite loop updates m such that upon termination m[i] is true
if and only if A[i] is maximal, that is, \(m[i] \Leftrightarrow max(A[i])\) for all \(1 \le i \le n\).
Assuming that all entries in m are set to true initially, we intend to use the
finite loop to set m[i] to false whenever it has been determined that A[i] is not
maximal. Thus, once an entry has been reset, it will never be set again, that is,
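\[
\begin{array}{lcl}
C_1 & \equiv & x{:}[tt,tt]^{*} ;\ \{max(x)\} \\
 & & \text{where } max(x) \equiv \forall 1 \le i \le n.\ x \ge A[i] \\[6pt]
C_2 & \equiv & \mathbf{new}\ m[1..n] = tt\ \mathbf{in} \\
 & & \quad m{:}[tt,\ \forall i.\ \mathit{pre}\ \neg m[i]]^{*} ;\ \{I\} ; \\
 & & \quad x{:}[tt,tt]^{*} ;\ \{max(x)\} \\
 & & \text{where } \mathit{pre}\ \neg m[i] \equiv \overleftarrow{\neg m[i]} \Rightarrow \neg m[i], \\
 & & \qquad\ \ I \equiv \forall 1 \le i \le n.\ I_i, \\
 & & \qquad\ \ I_i \equiv m[i] \Leftrightarrow max(A[i]) \\[6pt]
C_3 & \equiv & \mathbf{new}\ m[1..n] = tt\ \mathbf{in} \\
 & & \quad m{:}[tt,\ \forall i.\ \mathit{pre}\ \neg m[i]]^{*} ;\ \{I\} ; \\
 & & \quad \big[\, \|_{i=1}^{n}\ \mathbf{if}\ m[i]\ \mathbf{then}\ x := A[i] \,\big]
\end{array}
\]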


C4  =  new  m[l..n]  =  ti  in 

I  ||"=i  m[{\:[U,pre  A  pre  /,]*  ;  {/,}]; 
jj"_i  if  m[t]  then  x  :=>l[i]] 


C5  =  new  m[l..n]  =  in 

’  ll"=i  A  pre  -mH]]  ]; 

IIJLi  if  m[i]  then  a::=>l[z]] 

where  /,*  j  =  (^j^A[j]  <  yl[z]  =»  m[z])  A  (A\j]  >  >l[z]  =>  “^m[z]) 


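\[
\begin{array}{lcl}
C_6 & \equiv & \mathbf{new}\ m[1..n] = tt\ \mathbf{in} \\
 & & \quad \big[\, \|_{i=1}^{n}\ \big[\, \|_{j=1}^{n}\ \mathbf{if}\ A[i] < A[j]\ \mathbf{then}\ m[i] := f\!f \,\big] \,\big] ; \\
 & & \quad \big[\, \|_{i=1}^{n}\ \mathbf{if}\ m[i]\ \mathbf{then}\ x := A[i] \,\big] \\[6pt]
C_7 & \equiv & \mathbf{new}\ m[1..n] = tt\ \mathbf{in} \\
 & & \quad \big[\, \|_{(i,j)=(1,1)}^{(n,n)}\ \mathbf{if}\ A[i] < A[j]\ \mathbf{then}\ m[i] := f\!f \,\big] ; \\
 & & \quad \big[\, \|_{i=1}^{n}\ \mathbf{if}\ m[i]\ \mathbf{then}\ x := A[i] \,\big]
\end{array}
\]

Figure 6.3: Derivation of the first solution to the maximum search problem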
every update of m[i] preserves \(\neg m[i]\), that is, it satisfies

\[
\mathit{pre}\ \neg m[i] \equiv \overleftarrow{\neg m[i]} \Rightarrow \neg m[i].
\]
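The monotone use of the flags can be illustrated with a short Python sketch. It anticipates the kind of update the derivation eventually arrives at (compare C7 in Figure 6.3), namely resetting m[i] whenever some A[j] is larger than A[i]. As before, this is an assumption-laden model: the parallel processes are approximated by a randomly shuffled sequential order, indices are 0-based, and the name `mark_maxima` is hypothetical.

```python
import random


def mark_maxima(A, seed=0):
    """Return flags m with m[i] true exactly when A[i] is a maximum of A."""
    rng = random.Random(seed)
    n = len(A)
    m = [True] * n                       # new m[1..n] = tt
    pairs = [(i, j) for i in range(n) for j in range(n)]
    rng.shuffle(pairs)                   # some interleaving of the n*n processes
    for i, j in pairs:
        if A[i] < A[j]:
            m[i] = False                 # flags are only ever reset, never set
    return m
```

Because a flag that has been reset is never set again, the final value of m does not depend on the order in which the pairs are processed.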

We show that when ignoring the updates to the local variables m[1] through
m[n], the first sequential component of C2 just exhibits a finite number of
stuttering steps.

Formally, 

7^1  =  [tt,  Preds{ii)] 

skip 

>-  (Lemma  2.1),  Lemma  5.4 

skip*  ;  skip 

^{m[l],... ATOM,  STAR,  SEQ 

m:[ttjyi.pre  ~»m[2]]*  ;  {/} 

[tt,  Preds{9)] 

where \(\mathit{pre}\ \neg m[i]\) and I are given in Figure 6.3. Moreover, program

\[
x{:}[tt,tt]^{*} ;\ \{max(x)\}
\]

can easily be shown to establish the desired postcondition, because it enforces
it explicitly.

7^2  =  [itj{max{x)}] 

x:[tt^U]*  ;  {max{x)} 

y  ATOM,  STAR,  SEQ 

x:[ttjU]*  ;  {max[x)} 

[max{x),  Preds(9)] . 

The  refinements  IZi  and  7^2  are  composed  sequentially. 

7^3  =  [ti,{max{x)}] 

x\[tt,  tt]*  ;  {max{x)} 

)>-  =:.^j(Lemma  2.1),  Lemma  5.4 

skip; 

x:[U^tt]*  ;  {max{x)} 

^{m[l],...  ,m[n]}  S£Q{tZi,'R2) 

m:[U,P]* ;  {/}; 
x:[ttjtt]*  ;  {maic(a?)} 

[max{x  ),  Prec/5(0)] 



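To obtain refinement between C1 and C2, the variables m[1] through m[n] are
declared.

\[
\begin{array}{ll}
 & [tt,\ \{max(x)\}] \\
 & C_1 \succ C_2 \qquad \text{NEW-INTRO}(\mathcal{R}_3) \\
 & [max(x),\ \mathit{Preds}(\emptyset)]
\end{array}
\]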


Refining C2 into C3

Before refining the just introduced first part, we specify how the relationship
between m and A expressed in I should be used to compute max(x). Assuming
I, we will attempt to compute max(x) with maximal efficiency, that is, the loop
\(x{:}[tt,tt]^{*} ;\ \{max(x)\}\) will be refined into a parallel composition of n processes.
We want process i to establish the postcondition

\[
Q_i \equiv max(A[i]) \Rightarrow max(x).
\]

Note that \(\forall 1 \le i \le n.\ Q_i\) implies max(x). However, before we can use rule
PAR-INTRO to do this, we must ensure that each parallel process i preserves
the postconditions \(Q_j\) of the other processes. Thus, the computation of each of
the future parallel processes needs to be restricted appropriately. Formally,

7^4  =  x:[ttjttY  ;  {max(x)} 

D'j-t  x:[tt^\/i.pre  Qi]*  ]  {max{x)].  Lemma  2.2 

Assuming I, the postcondition $Q_i$ can be achieved by setting x to some A[i] for which m[i] holds. Thus, in the application of PAR-INTRO below, we will refine the computation of the loop body from

$x:[tt, \forall i.\ pre\ Q_i]$


into 

if  m[i]  then  x:=A[i]. 

We check the four premises of the rule.

1. Robustness of $x:[tt, \forall i.\ pre\ Q_i]$ follows directly with Proposition 2.1. Moreover, it can easily be seen that $\{Q_i \mid 1 \le i \le n\}$ is preserved in all contexts using Proposition 4.1.




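2. Second, $x:[tt, \forall i.\ pre\ Q_i]$ is refined into the desired conditional.

$x:[tt, \forall i.\ pre\ Q_i]$

$\succ$  $=_{T^\dagger}$(Lemma 2.1), Lemma 5.4

$(skip ; x:[tt, \forall i.\ pre\ Q_i] \vee skip ; x:[tt, \forall i.\ pre\ Q_i])$

$\succ$  $\supseteq_{T^\dagger}$(Lemma 2.2), Lemma 3.4

if m[i] then $x:[tt, \forall i.\ pre\ Q_i]$ else $x:[tt, \forall i.\ pre\ Q_i]$

$\succ$  ATOM, COND

if m[i] then x := A[i]

for all $1 \le i \le n$ where

$Q_i = max(A[i]) \Rightarrow max(x)$,
$\Gamma_i = \{I, Q_i\}$, and
$\Delta = \{I\} \cup \{Q_i \mid 1 \le i \le n\}$.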

Note  that  the  refinement  preserves  Qi  for  all  i  as  desired.  Moreover, 
the  last  refinement  step  requires  property  I  to  show  that  Qi  holds  upon 
termination  of  the  conditional. 


3. The assumptions of process i are contained in the guarantees of its environment, that is, of all processes j with $j \ne i$. Formally, $\Gamma_i \subseteq \Delta$, and thus,

$\Gamma_i \subseteq \bigcap_{j \ne i} \Delta = \Delta$.

4. As mentioned above, the overall postcondition max(x) follows from the conjunction of the postconditions $Q_i$ of each of the processes, that is,

$(\forall 1 \le i \le n.\ Q_i) \Rightarrow max(x)$.


Thus, 

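$[I, \{I, max(x)\} \cup \{Q_i \mid 1 \le i \le n\}]$

$x:[tt, tt]^* ; \{max(x)\}$

$\succ$  $\supseteq_{T^\dagger}$($\mathcal{R}_4$), Lemma 3.4

$x:[tt, \forall i.\ pre\ Q_i]^* ; \{max(x)\}$

$\succ$  PAR-INTRO(1,2,3,4), WEAK

$\|_{i=1}^{n}$ if m[i] then x := A[i]

$[max(x), Preds(\emptyset)]$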

To obtain refinement between C2 and C3, we first sequentially add the first part that computes I, and then declare the array m.

$[tt, \{max(x)\} \cup \{max(A[i]), \neg max(A[i]) \mid 1 \le i \le n\}]$

C2 $\succ$ C3    SEQ, NEW(m)

$[max(x), Preds(\emptyset)]$.
Note how the locality of array m, and thus the application of NEW, replaces the assumption that I be preserved by the assumption that the value of max(A[i]) never changes. Moreover, the preservation of max(x) implies the preservation of $Q_i$ for all i.

Refining C3 into C4

We now specify how property I should be achieved by the first part. The resulting refinement and its derivation will be similar in structure to the previous one from C2 to C3. Again, we will attempt to compute I as efficiently as possible. The loop $m:[tt, \forall i.\ pre\ \neg m[i]]^*$ will be replaced by a parallel composition. As before, the desired postcondition $I_i$ for each parallel process i must be determined. Let

$I_i = m[i] \Leftrightarrow max(A[i])$.

Note that $\forall 1 \le i \le n.\ I_i$ implies I. Before the loop can be refined into a parallel composition using PAR-INTRO, we must ensure that the postconditions are preserved by each of the parallel processes. Moreover, $I_i$ is not trivial enough to be established in a single step. Thus, we allow each parallel process an arbitrary but finite number of steps rather than just one as in the case of C2. Formally,

7^5  =  Vf.pre  ;  {max(x)} 

Lemma  2.1 

Vi.pre  “«m[i]]*)*  ;  {max{x)} 

Lemma  2.2 

(rn:[tij\/i.pre  ->m[i]  A  pre  /i]’*')*  ;  {max(x)]. 

We will require $I_i$ to be achieved by updating m[i] an arbitrary, but finite number of times in such a way that $\neg m[i]$ and $I_i$ are preserved until $I_i$ holds.

More precisely, in the application of PAR-INTRO below

$m:[tt, \forall i.\ pre\ \neg m[i] \wedge pre\ I_i]^*$

is replaced by

$[\ \|_{i=1}^{n}\ m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}\ ]$.

We check the four premises of the rule.

1. Since $m:[tt, \forall i.\ pre\ \neg m[i] \wedge pre\ I_i]^*$ is a finite loop over an atomic statement, robustness follows with Proposition 2.1. Moreover, it also preserves $\{I_i \mid 1 \le i \le n\}$ in all contexts. To see this, we use Proposition 4.1.

2. Second, each of the parallel programs

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}$

is shown to be subsumed by

$m:[tt, \forall i.\ pre\ \neg m[i] \wedge pre\ I_i]^*$,
that is,

$[tt, \{I_i\}]$

$m:[tt, \forall i.\ pre\ \neg m[i] \wedge pre\ I_i]^*$

$\succ$  $\supseteq_{T^\dagger}$(Lemma 2.2), Lemma 3.4

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^*$

$\succ$  $=_{T^\dagger}$(Lemma 2.1), Lemma 5.4

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; skip$

$\succ$  ATOM, STAR, SEQ

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}$

$[I_i, \Delta]$

for all $1 \le i \le n$ where $\Delta = \{I_i \mid 1 \le i \le n\}$.

3. Third, the non-interference condition is met, because $\{I_i\} \subseteq \Delta$. Thus,

$\{I_i\} \subseteq \bigcap_{j=1, j \ne i}^{n} \Delta = \Delta$.

4. Fourth, as mentioned above, the conjunction of the postconditions of each of the processes implies the desired overall postcondition.

$(\forall i.\ I_i) \Rightarrow I$


Thus, 

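$[tt, \{I_i \mid 1 \le i \le n\}]$

$m:[tt, \forall i.\ pre\ \neg m[i]]^* ; \{I\}$

$\succ$  $\supseteq_{T^\dagger}$($\mathcal{R}_5$), Lemma 3.4

$(m:[tt, \forall i.\ pre\ \neg m[i] \wedge pre\ I_i]^*)^* ; \{I\}$

$\succ$  PAR-INTRO(1,2,3,4), WEAK

$[\ \|_{i=1}^{n}\ m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}\ ]$

$[I, Preds(\emptyset)]$.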

Refinement  between  C3  and  C4  follows  with  the  indicated  rules. 

$[tt, \{max(x)\} \cup \{max(A[i]), \neg max(A[i]) \mid 1 \le i \le n\}]$

C3 $\succ$ C4    SEQ, NEW

$[max(x), Preds(\emptyset)]$


Refining  C4  into  C5 

The refinement step above has broken the computation of I into a parallel composition of processes where process i is required to establish $I_i$ while changing only m[i] and preserving $\neg m[i]$ and $I_i$. More precisely, process i is of the form

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}$.

This loop will be implemented by another parallel composition. Each of the processes (i, j) will compute

$I_{i,j} = (max(A[i]) \Rightarrow m[i]) \wedge (A[i] < A[j] \Rightarrow \neg m[i])$.

Note that $\forall 1 \le j \le n.\ I_{i,j}$ implies $I_i$. To prepare for the application of PAR-INTRO, we need to ensure that process (i, j) preserves the postconditions of the other processes. It turns out, though, that no additional requirements are needed. To see this, assume $I_{i,j}$. Let (k, l) be a process in the environment of (i, j). If $k \ne i$, then (k, l) does not modify m[i], which implies preservation of $I_{i,j}$. If k = i, then (k, l) preserves $A[i] < A[j] \Rightarrow \neg m[i]$, because it preserves $\neg m[i]$. Moreover, it also preserves $max(A[i]) \Rightarrow m[i]$, because it preserves $I_i$ by assumption. Thus, no additional requirements need to be added. Moreover, assuming m[i] is set initially, $I_{i,j}$ can be established in a single step by resetting m[i] if A[i] < A[j]. We check the premises of rule PAR-INTRO.

1. Since $m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]$ is atomic, robustness follows directly with Proposition 2.1. Preservation of $\{I_{i,j} \mid 1 \le i,j \le n\}$ follows directly with Proposition 4.1.

2. Second, we derive

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]$

$\succ$  $\supseteq_{T^\dagger}$(Lemma 2.2), Lemma 3.4

$m[i]:[tt, I_{i,j}]$

for all j where $\Delta = \{I_{i,j} \mid 1 \le i,j \le n\}$. Note that the above refinement preserves $I_{l,m}$ for all $1 \le l,m \le n$, because only m[i] can be changed and every such change results in a state satisfying $I_{i,j}$.

3. The environment of (i, j) guarantees the preservation of $I_{i,j}$ (the assumption of process (i, j)), that is,

$\{I_{i,j}\} \subseteq \bigcap_{l=1, l \ne j}^{n} \Delta = \Delta$.

4. Finally,

$(\forall 1 \le j \le n.\ I_{i,j}) \Rightarrow I_i$.
Thus, 

$\mathcal{R}_{6,i} = [tt, \Gamma_i]$

$m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}$

$\succ$  PAR-INTRO(1,2,3,4)

$\|_{j=1}^{n}\ m[i]:[tt, I_{i,j} \wedge pre\ \neg m[i]]$

where

$\Gamma_i = \{I_{i,j} \mid 1 \le j \le n\}$

$\Delta = \{I_{i,j} \mid 1 \le i,j \le n\}$.

Each of these n parallel compositions is now put in parallel, resulting in nested parallelism.

\\i=im[i]:[U,  lij  A  pre  -.m[i]]*  ;  {/,} 

PAR-M(K6,i) 

ll"=i[||j=i»^[*]:[«>  h,j/\pre  -nm[j]]] 

The  overall  refinement 

$[tt, \{max(x)\} \cup \{\neg max(A[i]), A[i] > A[j] \mid 1 \le i,j \le n\}]$

C4 $\succ$ C5    WEAK, SEQ, NEW

$[max(x), Preds(\emptyset)]$

follows. 

Refining C5 into C6

We now refine $m[i]:[tt, I_{i,j} \wedge pre\ \neg m[i]]$. This program will do nothing else but reset m[i] in situations in which this reset makes $I_{i,j}$ true, that is, if and only if A[i] is less than A[j]. The only candidate for the refinement thus is

if A[i] < A[j] then m[i] := ff.

Formally,  we  derive 

A[{\  <  Ay]}] 
m[i]:[tt,  Pi  A  lij] 

y  (Lemma  2.1),  Lemma  5.4 

skip  ;  Uj  A  pre  -im[2]]  V  skip  ;  rn[i]:\tt^  lij  A  pre 

Lemma  3.4 

if  A[i]  <  A[j]  then  m[i]:[tt^  Pi  A  lij] 
else  m[i]:[it,  Pi  A  lij] 

y  ATOM,  COND 

if A[i] < A[j] then m[i] := ff

$[I_{i,j}, Preds(\emptyset)]$.

Note that the last refinement step relies on m[i] being set initially to show that the conditional establishes the postcondition $I_{i,j}$. To obtain refinement between C5 and C6, the nested parallel context needs to be built up.

[m[i],  <  A[j]  |  1  <  J  <  n}] 

\\%irn[i]:[tt,PiMij] 

>-  PAR-N 

/l[i]  <  i4[i]  then  m[i]:=ff 
[I,  Pre(/s(0)] 

and 


[Vi.m[i],{m[i],7,j,.4[i]  <  A\j]  \  l<i,j  <  n}] 

>-  PAR-N 

ll"=i[||j=iif  <  ^\j]  then  m[{\:=ff] 

[l,Prtds{<ll)]. 

Applications of SEQ and NEW conclude this refinement step.

$\mathcal{R}_7 = [tt, \{max(x)\} \cup \{\neg max(A[i]), A[i] < A[j], A[i] > A[j] \mid 1 \le i,j \le n\}]$

C5 $\succ$ C6    SEQ, NEW

$[max(x), Preds(\emptyset)]$.


Refining C6 into C7

Since parallel composition is associative, nested parallelism as in $\|_{i=1}^{n}[\|_{j=1}^{n} C_{i,j}]$ can be flattened out, that is,

$\|_{i=1}^{n}[\|_{j=1}^{n} C_{i,j}] =_{T^\dagger} \|_{(i,j)=(1,1)}^{(n,n)} C_{i,j}$

for all $C_{i,j}$. Thus, $\mathcal{R}_8$ = C6 $=_{T^\dagger}$ C7. Using the previous approximation we get

$[tt, \{max(x)\} \cup \{\neg max(A[i]), A[i] < A[j], A[i] > A[j] \mid 1 \le i,j \le n\}]$

C5 $\succ$ C7    Lemma 5.4($\mathcal{R}_7$, $\mathcal{R}_8$)

$[max(x), Preds(\emptyset)]$


Putting  it  all  together 
Thus, 

$[tt, \{max(x)\} \cup \{A[i] < A[j], A[i] > A[j] \mid 1 \le i,j \le n\}]$

C1 $\succ$ C7    Lemma 5.3

$[max(x), Preds(\emptyset)]$

where the assumptions are a simplification of

$\{max(x)\}$

$\cup\ \{Q_i \mid 1 \le i \le n\}$

$\cup\ \{\neg max(A[i]), max(A[i]), A[i] < A[j], A[i] > A[j] \mid 1 \le i,j \le n\}$.

Discussion 

The assumptions in the final refinement statement are worth analyzing. Clearly, if the array A never changes, then all predicates in $\Gamma$ are preserved, that is,

$\Gamma \subseteq Preds(\{A\})$.

However, this is more restrictive than necessary. $\Gamma$ does leave some room for A to be changed. More precisely, every environment transition that preserves A[i] < A[j] for all $1 \le i,j \le n$ also obeys $\Gamma$ and is thus permissible. Consequently, the parallel environment can, for instance, subtract a non-negative number from the minimum, or add in one atomic step a positive number to every entry without violating the assumptions and destroying the validity of the refinement.
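The room that the assumptions leave for the environment can be illustrated with a small check. The following Python sketch is not part of the derivation: the helper name `preserves_order` and the encoding of A as a list are assumptions made for illustration only. It tests whether a single environment step keeps every ordering predicate A[i] < A[j] that held before the step (the predicates A[i] > A[j] are covered by swapping i and j).

```python
def preserves_order(before, after):
    """Return True if every predicate A[i] < A[j] that holds in `before`
    still holds in `after` (a hypothetical model of one environment step)."""
    n = len(before)
    return all(
        after[i] < after[j]
        for i in range(n)
        for j in range(n)
        if before[i] < before[j]
    )


a = [3, 1, 2]
# Subtracting a non-negative number from the minimum keeps all orderings.
print(preserves_order(a, [3, 0, 2]))
# Adding a positive number to every entry in one atomic step does too.
print(preserves_order(a, [v + 5 for v in a]))
# Raising the minimum above another entry breaks an ordering predicate.
print(preserves_order(a, [3, 5, 2]))
```

The first two steps correspond to the two examples named in the text; the third is a step the assumptions rule out.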

6.3.1 Alternative refinements

Deriving  a  sequential  implementation 

The implementation we have derived is highly parallel. Given our calculus, it is straightforward to specialize it into a sequential implementation. Consider Figure 6.4. The trace inclusion C6 $\supseteq_{T^\dagger}$ C7' is obtained with Lemma 2.2.8 to decrease parallelism.


C6 = new m[1..n] = tt in
       [ $\|_{i=1}^{n}$ [ $\|_{j=1}^{n}$ if A[i] < A[j] then m[i] := ff ] ];
       [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

C7' = new m[1..n] = tt in
        for i = 1 to n do
          for j = 1 to n do
            if A[i] < A[j] then m[i] := ff;
        for i = 1 to n do
          if m[i] then x := A[i]

Figure 6.4: Derivation of the second solution to the maximum search problem


Programs C7 on the one hand and C7' on the other represent two extremes on a spectrum from maximally parallel to maximally sequential. Various mixed implementations could be derived using Lemma 2.2.8.
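The sequential program of Figure 6.4 translates almost line by line into an executable form. The Python sketch below is an illustration, not the thesis's notation: the function name `max_search_sequential`, the 0-based indexing, and the initial value `None` for x are assumptions.

```python
def max_search_sequential(a):
    """Sequential maximum search following the two loops of Figure 6.4."""
    n = len(a)
    m = [True] * n                  # new m[1..n] = tt
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                m[i] = False        # m[i] := ff: A[i] is not maximal
    x = None                        # assumption: x starts undefined
    for i in range(n):
        if m[i]:
            x = a[i]                # every marked entry holds the maximum
    return x


print(max_search_sequential([3, 9, 2, 9]))
```

After the first loop nest, m[i] holds exactly for the indices of maximal entries, so it does not matter which of them is assigned to x last.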
Avoid multiple updates to m[i]

Program C6 implements the initial specification C1 by first spawning $n^2$ parallel processes $C_{i,j}$ each of which executes

if A[i] < A[j] then m[i] := ff.

The coupling between these parallel processes is very loose, that is, they neither influence each other nor depend on each other's behaviour. Consequently, the assignment m[i] := ff may be executed several times. For instance, if A[i] happens to be the least element in A, m[i] is set to false n − 1 times. In a truly concurrent setting with $n^2$ available processors, we are not penalized for this redundancy. However, if we do not have $n^2$ processors or parallelism is simulated on a single-processor machine, one may wish for an implementation that is more efficient in the sense that it executes each m[i] := ff at most once. Figure 6.5 summarizes the derivation of an alternative implementation C6''. The reduction in redundancy


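C5 = new m[1..n] = tt in
       [ $\|_{i=1}^{n}$ [ $\|_{j=1}^{n}$ $m[i]:[tt, I_{i,j} \wedge pre\ \neg m[i]]$ ] ];
       [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

where $I_{i,j} = (\forall j.\ A[j] \le A[i] \Rightarrow m[i]) \wedge (A[j] > A[i] \Rightarrow \neg m[i])$

C6'' = new m[1..n] = tt in
         [ $\|_{i=1}^{n}$ [ $\|_{j=1}^{n}$ await tt then if m[i] $\wedge$ A[i] < A[j] then m[i] := ff ] ];
         [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

Figure 6.5: Derivation of the third solution to the maximum search problem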


in C6'' must be paid for by tighter coupling and decreased parallelism.
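The effect of the additional guard m[i] can be made concrete by counting assignments. The Python sketch below simulates only one particular sequential interleaving of the $n^2$ processes, which is an assumption of the sketch; the name `count_resets` and the flag `guarded` are hypothetical. With `guarded=False` each process behaves like the unguarded conditional, with `guarded=True` like the atomic test of m[i] and A[i] < A[j].

```python
def count_resets(a, guarded):
    """Run the n*n marking processes in row-major order and count
    how often the assignment m[i] := ff is executed."""
    n = len(a)
    m = [True] * n
    resets = 0
    for i in range(n):
        for j in range(n):
            # The guarded variant tests m[i] and the comparison together.
            if (m[i] or not guarded) and a[i] < a[j]:
                m[i] = False
                resets += 1
    return m, resets


print(count_resets([1, 2, 3, 4], guarded=False))
print(count_resets([1, 2, 3, 4], guarded=True))
```

Both variants end with the same marking; only the number of executed assignments differs.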

Program C6'' assigns false to m[i] at most once, but still executes the test m[i] $\wedge$ A[i] < A[j] n − 1 times. Figure 6.6 summarizes a second alternative refinement that improves on this and executes the assignment at most once and the test at most n − 1 times but possibly less often. Program C6''' tests if there is an element j in A that is greater than A[i] in one high-level step. Program C7''' implements this by scanning A sequentially until either all elements have been looked at or a greater element is found. It thus minimizes the number of redundant assignments and tests, at the cost of an even lower degree of parallelism.
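The scanning strategy can be sketched as follows. This Python fragment is an illustration under assumptions: the processes are run one after another, indices are 0-based, and the name `mark_by_scan` and the test counter are not from the thesis. Unlike the bound stated above, this simple version also compares A[i] with itself.

```python
def mark_by_scan(a):
    """For each i, scan A until a greater element is found or A is exhausted;
    reset m[i] at most once. Returns the marking and the number of tests."""
    n = len(a)
    m = [True] * n
    tests = 0
    for i in range(n):
        k = 0
        while k < n:
            tests += 1
            if a[k] > a[i]:
                break               # witness found: A[i] is not maximal
            k += 1
        if k < n:
            m[i] = False            # executed at most once per i
    return m, tests


print(mark_by_scan([4, 1, 3]))
```

For the input shown, the scan for the maximal entry inspects all three elements, while the scans for the other two entries stop at the first element.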

An  overview  of  all  refinements  presented  in  this  section  can  be  found  in 
Figure  6.7. 
C4 = new m[1..n] = tt in
       [ $\|_{i=1}^{n}$ $m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i]^* ; \{I_i\}$ ];
       [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

C5''' = new m[1..n] = tt in
          [ $\|_{i=1}^{n}$ $m[i]:[tt, pre\ \neg m[i] \wedge pre\ I_i] ; \{I_i\}$ ];
          [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

C6''' = new m[1..n] = tt in
          [ $\|_{i=1}^{n}$ if $\exists j.\ A[i] < A[j]$ then m[i] := ff ];
          [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

C7''' = new m[1..n] = tt in
          [ $\|_{i=1}^{n}$ new $k_i$ = 1 in
               [ while $A[k_i] \le A[i] \wedge k_i \le n$ do $k_i := k_i + 1$ ];
               if $k_i \le n$ then m[i] := ff ];
          [ $\|_{i=1}^{n}$ if m[i] then x := A[i] ]

Figure 6.6: Derivation of the fourth solution to the maximum search problem
Figure 6.7: Overview of solutions to the maximum search problem

6.4 Example: Array search


Let A[1..n] be an array. Let search(A) be the smallest index of an element satisfying some predicate P, if it exists. Let search(A) = n + 1 otherwise. Formally,

$search(A) = \begin{cases} \min\{k \mid P(A[k])\}, & \text{if } P(A[k]) \text{ for some } 1 \le k \le n \\ n + 1, & \text{otherwise.} \end{cases}$
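As an executable reading of this definition, the following Python sketch computes search(A) directly. The function name `search`, the predicate argument `p`, and the use of a Python list for A are illustrative assumptions; indices are reported 1-based to match the text.

```python
def search(a, p):
    """Smallest 1-based index k with p(a[k]), or len(a) + 1 if none exists."""
    for k, value in enumerate(a, start=1):
        if p(value):
            return k
    return len(a) + 1


def is_odd(v):
    return v % 2 == 1


print(search([4, 7, 10, 13], is_odd))
print(search([2, 4], is_odd))
```

This direct version serves only as a reference against which the parallel refinements can be compared.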


We seek to develop a program C that computes search(A) and stores it in x. The initial specification C1 thus is

C1 = x := search(A).

Moreover, we want C to consist of $m \le n$ parallel subprograms. For simplicity, we assume that m divides n. The ith subprogram will search the array entries i, i + m, i + 2m, ..., i + n looking for x. We will call the entries $1 \le j \le n$ such that $j \equiv i \pmod{m}$ the ith partition of A, that is,

$part_{m,n}(i) = \{j \mid 1 \le j \le n \wedge j \equiv i \pmod{m}\}$.
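The partitions can be enumerated explicitly. The Python sketch below is illustrative; the function name `part` and its argument order are assumptions chosen to mirror the subscripts m, n and the argument i.

```python
def part(m, n, i):
    """Indices j in 1..n with j congruent to i modulo m (the ith partition)."""
    return [j for j in range(1, n + 1) if j % m == i % m]


# With n = 9 and m = 3 the three partitions split the index range 1..9.
for i in (1, 2, 3):
    print(i, part(3, 9, i))
```

Every index between 1 and n lies in exactly one of the partitions 1 through m.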

This problem has occurred frequently in the literature on reasoning about concurrent programs. The binary case (m = 2) is discussed in [OG76a, AO91]. Stirling, however, also treats the general case [Sti88]. In contrast, we will also derive alternative implementations. The entire first refinement is summarized in Figure 6.8.


Refining  Ci  into  C2 

As in the development of the maximum search algorithm, the problem will be split into two sequential parts where the first part will establish a predicate that will allow a straightforward computation of the desired result in the second part. This refinement step will introduce the first part. The idea is to have m local variables $y_1$ through $y_m$ where $y_i$ ranges over partition i. Each variable $y_i$ will be initialized with n + i indicating that no entry satisfying P has been found yet in partition i. The value of these variables can be changed in a terminating loop in a nonincreasing fashion. Upon termination of this loop, the desired index search(A) should be given by the minimum over $y_1$ through $y_m$, that is,

$Q = search(A) = \min\{y_1, \ldots, y_m\}$.

Assigning $\min\{y_1, \ldots, y_m\}$ to x thus leaves x with the desired value.
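C1 = x := search(A)

C2 = new $y_1 = n + 1, \ldots, y_m = n + m$ in
       $\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i}]^* ; \{Q\}$;
       $x := \min\{y_1, \ldots, y_m\}$
     where $Q = \min\{y_1, \ldots, y_m\} = search(A)$

C3 = new $y_1 = n + 1, \ldots, y_m = n + m$ in
       [ $\|_{i=1}^{m}$ $y_i:[tt, y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$ ];
       $x := \min\{y_1, \ldots, y_m\}$
     where $Q_i = R_i \Rightarrow y_i = search(A) \mid y_i \ge search(A)$
           $R_i = search(A) \in part_{m,n}(i)$

C4 = new $y_1 = n + 1, \ldots, y_m = n + m$ in
       [ $\|_{i=1}^{m}$ new $x_i = i$ in
            $\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$ ];
       $x := \min\{y_1, \ldots, y_m\}$

C5 = new $y_1 = n + 1, \ldots, y_m = n + m$ in
       [ $\|_{i=1}^{m}$ new $x_i = i$ in
            $(\{x_i < y_i \wedge I_i\} ; \{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i])^* ; \{Q_i\}$ ];
       $x := \min\{y_1, \ldots, y_m\}$
     where $I_i = I_i^1 \wedge I_i^2 \wedge I_i^3$
           $I_i^1 = x_i \equiv i \pmod{m} \wedge (\forall j \in part_{m,n}(i).\ j < x_i \Rightarrow \neg P(A[j]))$
           $I_i^2 = y_i \le n \Rightarrow x_i = y_i \wedge P(A[y_i])$
           $I_i^3 = y_i = n + i \vee y_i \le n$

C6 = new $y_1 = n + 1, \ldots, y_m = n + m$ in
       [ $\|_{i=1}^{m}$ new $x_i = i$ in
            while $x_i < y_i$ do
              if $P(A[x_i])$ then $y_i := x_i$ else $x_i := x_i + m$ ];
       $x := \min\{y_1, \ldots, y_m\}$

Figure 6.8: Derivation of the first solution to the array search problem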
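The final program of Figure 6.8 can be simulated to see how the m processes cooperate. The Python sketch below is an illustration only: it replaces true parallelism by a round-robin scheduler that lets each process execute one loop iteration per round, which is one possible interleaving among many. The name `array_search` and the 0-based storage of the 1-based variables are assumptions.

```python
def array_search(a, p, m):
    """Simulate m partition-searching processes; return min(y_1..y_m)."""
    n = len(a)
    assert n % m == 0, "the text assumes that m divides n"
    y = [n + i for i in range(1, m + 1)]   # y_i = n + i: nothing found yet
    x = list(range(1, m + 1))              # x_i = i: first index of partition i
    progress = True
    while progress:
        progress = False
        for i in range(m):                 # one loop iteration of process i + 1
            if x[i] < y[i]:
                progress = True
                if p(a[x[i] - 1]):         # A is 1-based in the text
                    y[i] = x[i]
                else:
                    x[i] += m
    return min(y)                          # x := min{y_1, ..., y_m}


def is_odd(v):
    return v % 2 == 1


print(array_search([4, 7, 10, 13, 5, 8], is_odd, 3))
print(array_search([2, 4, 6, 8], is_odd, 2))
```

Because m divides n, each $x_i$ reaches exactly n + i when its partition contains no match, so the array is never read outside its bounds.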
Formally, we derive

$\mathcal{R}_1 = [tt, \{Q, x = search(A)\}]$

x := search(A)

$\succ$  $=_{T^\dagger}$(Lemma 2.1), Lemma 5.4

skip* ; skip;

x := search(A)

$\succ$  ATOM, STAR, SEQ

$x := \min\{y_1, \ldots, y_m\}$

$[x = search(A), Preds(\emptyset)]$.

Declaring $y_1$ through $y_m$ yields

$[tt, \{x = search(A)\} \cup \{search(A) = v \mid v \in N\}]$

C1 $\succ$ C2    NEW-INTRO($\mathcal{R}_1$, $y_1, \ldots, y_m$)

$[x = search(A), Preds(\emptyset)]$.

The declaration replaces the requirement to preserve Q in $\mathcal{R}_1$ by the requirement that search(A) must not change.

Refining  C2  into  C3 

The computation of Q is refined. The loop $\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i}]^* ; \{Q\}$ is refined into a parallel composition of m processes where process i is responsible for computing $Q_i$, where

$Q_i = R_i \Rightarrow y_i = search(A) \mid y_i \ge search(A)$

$R_i = search(A) \in part_{m,n}(i)$

and $P_{if} \Rightarrow P_{then} \mid P_{else}$ abbreviates $(P_{if} \Rightarrow P_{then}) \wedge (\neg P_{if} \Rightarrow P_{else})$ as before. Informally, $Q_i$ requires $y_i$ to carry the value $y_i = search(A)$ if search(A) is in the ith partition, and some value greater than or equal to search(A) otherwise. Note that this particular choice of $Q_i$ guarantees that Q is implied once all parallel processes terminate. However, before the loop can be broken into a parallel composition using PAR-INTRO, we need to ensure that each process i preserves the postconditions $Q_j$ of the other processes j. Moreover, process i will require several steps to compute $Q_i$.

$\mathcal{R}_2 = \{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i}]^*$

$=_{T^\dagger}\ (\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i}]^*)^*$    Lemma 2.1

$\supseteq_{T^\dagger}\ (\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^*)^*$    Lemma 2.2
We specify that process i achieves $Q_i$ in a finite loop while updating only $y_i$ in a nonincreasing fashion and preserving $Q_j$ for all j.

We check the four premises.

1. Since

$\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^*$

is a finite loop over an atomic statement, it is robust using Proposition 2.1. It also preserves $\{Q_1, \ldots, Q_m\}$ in all contexts (Proposition 4.1).

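2. Second, we verify that each of the parallel programs

$y_i:[tt, y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$

approximates

$\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^*$.

Formally,

$\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^*$

$\succ$  $=_{T^\dagger}$(Lemma 2.1), Lemma 5.4

$y_i:[tt, y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; skip$

$\succ$  ATOM, STAR, SEQ

$y_i:[tt, y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$

$[Q_i, \Delta]$

for all i where $\Delta = \{Q_1, \ldots, Q_m\}$.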

3. Moreover, the non-interference condition holds, since $\{Q_i\} \subseteq \Delta$.

4. Finally, the conjunction of the postconditions $Q_i$ implies the overall postcondition Q, that is, $(\forall i.\ Q_i) \Rightarrow Q$.

Thus, 

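$[tt, \{x = search(A)\} \cup \{Q_i \mid 1 \le i \le m\}]$

$\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i}]^* ; \{Q\}$

$\succ$  $\supseteq_{T^\dagger}$($\mathcal{R}_2$), Lemma 3.4

$(\{y_1, \ldots, y_m\}:[tt, \forall i.\ y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^*)^* ; \{Q\}$

$\succ$  PAR-INTRO(1,2,3,4)

$\|_{i=1}^{m}\ y_i:[tt, y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$

$[Q, Preds(\emptyset)]$.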

The proof is completed as follows.

$\mathcal{R}_3 = [tt, \{x = search(A)\} \cup \{search(A) = v \mid v \in N\}]$

C2 $\succ$ C3    SEQ, NEW($y_1, \ldots, y_m$)

$[x = search(A), Preds(\emptyset)]$.

Note that locality of $y_1$ through $y_m$ ensures preservation of $Q_i$ for all $1 \le i \le m$.
Refining C3 into C4

Each of the parallel processes in C3 requires us to establish a state such that $Q_i$ holds after a finite iteration in which $y_i$ changes in a non-increasing fashion, all other variables are unchanged, and $Q_i$ is preserved. To this end, we introduce an auxiliary variable $x_i$ that ranges over the ith partition. Since only $y_i$ is allowed to change, $x_i$ must be declared local. Assuming that the array is searched in the direction of increasing indices, $x_i$ is initialized with i, the smallest index equal to i modulo m. This yields program C4. The correctness of this step follows from


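$[tt, Preds(\emptyset)]$

$(y_i:[tt, y_i \le \overleftarrow{y_i} \wedge pre\ Q_i] ; \{Q_i\})^*$

$\succ$  ATOM, NEW-INTRO($x_i$)

new $x_i = i$ in

$(\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i] ; \{Q_i\})^*$

$[tt, Preds(\emptyset)]$

for all $1 \le i \le n$. This implies trace inclusion by Lemma 5.5. The overall refinement $\mathcal{R}_4$ = C3 $\supseteq_{T^\dagger}$ C4 then is established using congruence. Using the previous refinement $\mathcal{R}_3$ and weakening, this implies

$\mathcal{R}_5 = [tt, \{x = search(A)\} \cup \{search(A) = v \mid v \in N\}]$

C2 $\succ$ C4    Lemma 5.4($\mathcal{R}_3$, $\mathcal{R}_4$)

$[x = search(A), Preds(\emptyset)]$.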

Refining C4 into C5

This refinement step prepares the replacement of the finite loop by a while loop using rule WHILE-INTRO. To this end, we need to identify a loop termination condition B and loop invariant $I_i$ such that the conditions of the rule hold. Let

$B = x_i < y_i$

$I_i = I_i^1 \wedge I_i^2 \wedge I_i^3$

$I_i^1 = x_i \equiv i \pmod{m} \wedge (\forall j \in part_{m,n}(i).\ j < x_i \Rightarrow \neg P(A[j]))$

$I_i^2 = y_i \le n \Rightarrow x_i = y_i \wedge P(A[y_i])$

$I_i^3 = y_i = n + i \vee y_i \le n$.

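The correctness of the refinement from C4 into C5 then follows from:

$\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$

$=_{T^\dagger}$    Lemma 2.1

$(skip ; \{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i])^* ; \{Q_i\}$

$\supseteq_{T^\dagger}$    Lemma 2.2

$(\{x_i < y_i \wedge I_i\} ; \{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i])^* ; \{Q_i\}$.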
Thus, by congruence, $\mathcal{R}_6$ = C4 $\supseteq_{T^\dagger}$ C5, which implies

$[tt, \{x = search(A)\} \cup \{search(A) = v \mid v \in N\}]$

C2 $\succ$ C5    Lemma 5.4($\mathcal{R}_5$, $\mathcal{R}_6$)

$[x = search(A), Preds(\emptyset)]$,

by refinement $\mathcal{R}_5$ and weakening.

Refining  C5  into  Ce 

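The finite loop in C5 is refined into a while loop. The three premises of rule WHILE-INTRO are checked.

1. We refine the loop body and prove that $I_i$ is an invariant.

$[x_i < y_i \wedge I_i, \Gamma_i]$

$\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]$

$\succ$  $=_{T^\dagger}$(Lemma 2.1), Lemma 5.4

$(skip ; \{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i])$
$\vee\ (skip ; \{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i])$

$\succ$  $\supseteq_{T^\dagger}$(Lemma 2.2), Lemma 3.4

if $P(A[x_i])$ then $\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]$
else $\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]$

$\succ$  ATOM, COND

if $P(A[x_i])$ then $y_i := x_i$ else $x_i := x_i + m$

$[I_i, \Delta_i]$

where

$\Gamma_i = \{x_i < y_i, I_i, P(A[x_i]), \neg P(A[x_i])\}$

$\Delta_i = \{I_i\} \cup Preds(\{A, x_j, y_j \mid j \ne i\})$.

Thus, process i preserves all predicates over variables A, $x_j$, $y_j$ with $j \ne i$.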

2. Next, we show that $Q_i$ holds upon termination of the loop, that is, $x_i \ge y_i \wedge I_i \Rightarrow Q_i$ holds. To this end, assume $x_i \ge y_i \wedge I_i$. Due to $I_i^3$, we only need to distinguish two cases.

$y_i = n + i$: Thus, $x_i \ge n + i$. By $I_i^1$ this means that there is no j between 1 and n such that $j \equiv i \pmod{m}$ and $P(A[j])$, that is, $\neg P(A[j])$ for all $1 \le j \le n$ with $j \equiv i \pmod{m}$. By definition of search(A), we must have $\neg R_i$. But since $y_i = n + i$ and $search(A) \le n + 1$ by definition, we also have $y_i \ge search(A)$.

$y_i \le n$: Then, with $I_i^2$ we have $x_i = y_i$ and $P(A[y_i])$. Due to $I_i^1$ there is no index $x_i'$ that is less than $x_i$ and for which $x_i \equiv x_i' \pmod{m}$ and $P(A[x_i'])$. If $R_i$, then we must have $y_i = search(A)$ due to the definition of search(A). If $\neg R_i$, then $i \not\equiv search(A) \pmod{m}$. (Note that $search(A) \ge n + 1$ is impossible since we already have found $y_i \le n$ such that $P(A[y_i])$.) Again, by definition of search(A), $y_i \ge search(A)$.

3. Moreover, we need to find an arithmetic expression that allows us to prove termination of the while loop. Let that expression be $m_i$ with

$m_i = \begin{cases} y_i - x_i, & \text{if } y_i > x_i \\ 0, & \text{otherwise.} \end{cases}$

Then, $m_i$ satisfies the properties required by the rule. More precisely, $m_i$ is always nonnegative and $m_i = 0$ implies the violation of the loop condition, that is, $m_i \ge 0$ and $m_i = 0 \Rightarrow \neg(x_i < y_i)$. Furthermore, the loop body does decrease $m_i$, because

{inv*mi  ;  ami  ;  inv*mi)'^ 

Dj-t  inv  mi  ;  Um,  def:  c*, 

=7-*  inv  mi  ;  Umi  V  inv  mi  ;  arm  ^  ^2 

$\supseteq_{T^\dagger}\ \{P(A[x_i])\} ; y_i := x_i \vee \{\neg P(A[x_i])\} ; x_i := x_i + m$    Lemma 2.2
$=_{T^\dagger}$ if $P(A[x_i])$ then $y_i := x_i$ else $x_i := x_i + m$.    def: if

Note that the assignment $x_i := x_i + m$ always increases $x_i$ since m is assumed to be a constant greater than 0.
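The behaviour of the measure can be observed on a concrete run of a single process. The Python sketch below is illustrative: the function name `measure_trace` is hypothetical, the environment is assumed not to interfere, and m is assumed to divide n as in the text. It records the value of the measure before each loop iteration and once after termination.

```python
def measure_trace(a, p, m, i):
    """Run process i (1-based) in isolation and record the measure
    max(y_i - x_i, 0) before every iteration and after the loop."""
    n = len(a)
    assert n % m == 0, "the text assumes that m divides n"
    x, y = i, n + i
    trace = []
    while x < y:
        trace.append(max(y - x, 0))
        if p(a[x - 1]):
            y = x              # y_i := x_i makes the measure 0
        else:
            x += m             # x_i := x_i + m shrinks y_i - x_i by m
    trace.append(max(y - x, 0))
    return trace


def is_odd(v):
    return v % 2 == 1


for i in (1, 2, 3):
    print(i, measure_trace([4, 7, 10, 13, 5, 8], is_odd, 3, i))
```

In each trace the values decrease strictly and the final value is 0, at which point the loop condition $x_i < y_i$ no longer holds.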

An  application  of  the  WHILE-INTRO  rule  then  yields 

Re  =  [a*,  Fj  U  {Qi}  U  {mi  =n\ne  N}] 

{{xi  <yi  Mi}',  {Xi,yi)\[u,  Xi  >Xi  Ay,-  <%•  Apre  Q,])*  ;  {Q,} 

WHILE-INTRO(l,2,3) 

while  Xi  <  yi  do 

if  P{A[xi])  then  yi  :=Xi  else  Xi  :=Xi  +  m 

[Qi,Ai]. 

The  declaration  of  Xi  leads  to 

[y,-  =  n  +  i,  {Qi,yi  =  v,  P(A[?^]),  -iP(A[v])  |  v  e  N}] 
new  Xi  =  i  in 

({x;  <  yi  A  /,)  ;  {a;j,  y, •}:[«,  Xi  >Xi  Ay*  ;  {Qi} 

;>•  NEW(TC6,!Cj) 

new  Xi  ~  i  in 
while  Xi  <  yi  do 

if  P{A[xi])  then  yi  :=Xi  else  Xi  :=Xi  +  m 
[Qi ,  Preds{{A,  Xj,  yj  \  j  ^  i})] . 
where the required invariance of $m_i$ translates into the invariance of $y_i$. We conclude the derivation by building up the remaining context.

$[tt, \{x = search(A)\} \cup \{P(A[v]), \neg P(A[v]) \mid 1 \le v \le n\}]$

C5 $\succ$ C6    PAR-N, SEQ, NEW($y_1, \ldots, y_m$)

$[x = search(A), Preds(\emptyset)]$.

Locality of $y_1$ through $y_m$ and $x_1$ through $x_m$ ensures preservation of $Q_i$ for all i, and $m_i = n$ for all n, and for all predicates in $\Gamma_i$ except for $P(A[v])$ and $\neg P(A[v])$. Note that the invariance of search(A) is implied by the preservation of $P(A[v])$ and $\neg P(A[v])$ for all v.

Putting  it  all  together 
Thus, by transitivity

$[tt, \{x = search(A)\} \cup \{P(A[v]), \neg P(A[v]) \mid 1 \le v \le n\}]$

C1 $\succ$ C6    Lemma 5.3

$[x = search(A), Preds(\emptyset)]$.


Discussion 

1. Just as in the previous examples, it is interesting to observe the precise requirements the implementation places on its parallel environment. In this case, it is admissible, for instance, to change the contents of A as long as the value of P(A[i]) remains unchanged for all i.

2. At first glance, the maximum search problem of Section 6.3 is an instance of the array search problem discussed in this section. However, the solutions to both problems differ substantially in their degree of parallelism. The reason is that the function search(A) looks for the first index of A that satisfies P whereas in the maximum search problem any index pointing to the maximal value suffices. This requirement for search(A) does not lend itself to a parallel implementation. Consider program C3 in Figure 6.8. The parallel process i cannot be refined into a parallel composition analogous to the maximum search, because it is required to preserve $Q_i$, which requires non-local knowledge. Note that program C6 and the alternative, less parallel implementation C7''' of Section 6.3 have similar structure. Indeed, C7''' finds the smallest index $k_i$ that witnesses that A[i] is not maximal.


6.4.1  Alternative  refinements 

In the previous refinement sequence $y_i$ was used to find the minimal index of partition i pointing to an entry satisfying P, which clearly implies search(A) =
C4 = new $y_1 = n + 1, \ldots, y_m = n + m$ in
       [ $\|_{i=1}^{m}$ new $x_i = i$ in
            $\{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i]^* ; \{Q_i\}$ ];
       $x := \min\{y_1, \ldots, y_m\}$

C5' = new $y_1 = n + 1, \ldots, y_m = n + m$ in
        [ $\|_{i=1}^{m}$ new $x_i = i$ in
             $(\{x_i < \min\{y_1, \ldots, y_m\} \wedge I\} ; \{x_i, y_i\}:[tt, x_i \ge \overleftarrow{x_i} \wedge y_i \le \overleftarrow{y_i} \wedge pre\ Q_i])^* ; \{Q_i\}$ ];
        $x := \min\{y_1, \ldots, y_m\}$
      where $I = \forall 1 \le i \le n.\ I_i$

C6' = new $y_1 = n + 1, \ldots, y_m = n + m$ in
        [ $\|_{i=1}^{m}$ new $x_i = i$ in
             while $x_i < \min\{y_1, \ldots, y_m\}$ do
               if $P(A[x_i])$ then $y_i := x_i$ else $x_i := x_i + m$ ]

Figure 6.9: Derivation of the second solution to the array search problem


$\min\{y_1, \ldots, y_m\}$. We will now look at implementations that achieve this equation without $y_i$ necessarily pointing to the minimal entry in partition i.

More  efficiency  through  earlier  termination 

An alternative implementation with improved best-case behaviour can be found
by realizing that not every $y_i$ has to point to the first entry in partition $i$
satisfying $P$ for $search(A) = \min\{y_1, \ldots, y_m\}$ to hold. The search for such an
entry in partition $i$ can safely be aborted as soon as it is known that its index
would be greater than the first such index in some other partition. In other words,
the loop condition can be strengthened from $x_i < y_i$ to $x_i < \min\{y_1, \ldots, y_m\}$,
leading to early termination in some cases. The alternative refinement of $C_4$ based
on this idea is summarized in Figure 6.9.
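
The early-termination idea can be illustrated with a small sequential simulation.
The sketch below is not part of the derivation: the function name `strided_search`,
the use of 1-based indices over a Python list, and the round-robin loop that stands
in for fair parallel execution are all assumptions made for illustration. Each of the
m simulated processes scans the indices i, i+m, i+2m, ... and stops as soon as its
current index is no longer smaller than the minimum of all y values.

```python
from typing import Callable, List, Sequence


def strided_search(A: Sequence[int], P: Callable[[int], bool], m: int) -> int:
    """Return the smallest 1-based index k with P(A[k]), or n + 1 if there is none.

    Hypothetical helper: x[i] and y[i] play the roles of x_i and y_i in the text.
    """
    n = len(A)
    x: List[int] = [i for i in range(1, m + 1)]      # x_i starts at i
    y: List[int] = [n + i for i in range(1, m + 1)]  # y_i starts at n + i ("not found")
    active = True
    while active:
        active = False
        # Round-robin scheduling is an assumption standing in for fair parallelism.
        for i in range(m):
            if x[i] < min(y):            # strengthened loop condition
                active = True
                if P(A[x[i] - 1]):       # A is 1-based in the text, 0-based here
                    y[i] = x[i]
                else:
                    x[i] += m
    return min(y)
```

A process whose next candidate index is already larger than some recorded hit stops
without inspecting the rest of its partition, which is where the improved best-case
behaviour comes from.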


Refining $C_4$ into $C_5'$

Let

$B \equiv x_i < \min\{y_1, \ldots, y_m\}$
$I \equiv \forall 1 \le i \le n.\ I_i$

where $I_i$ is as above. The proof of refinement between $C_4$ and $C_5'$ is similar to
the one for $C_5$ and is omitted.

Refining $C_5'$ into $C_6'$

The proof of

$[tt,\ \{x = search(A)\} \cup \{\neg P(A[v]) \mid v \in \mathbb{N}\}]$
$C_5' \succ C_6'$
$[x = search(A),\ Preds(\emptyset)]$

has the same structure as that of

$[tt,\ \{x = search(A)\} \cup \{\neg P(A[v]) \mid v \in \mathbb{N}\}]$
$C_5 \succ C_6$
$[x = search(A),\ Preds(\emptyset)]$.

However, the stronger loop condition $x_i < \min\{y_1, \ldots, y_m\}$ results in the
following changes:

•  The loop invariant must be strengthened from $I_i$ to $I$. This means that
$I$ must be preserved rather than $I_i$. Note that every process $j$ with $j \ne i$
trivially preserves $I_i$ because it cannot change $x_i$ or $y_i$. Moreover, it also
preserves $I_j$. Thus, the environment of $i$ preserves $I_j$ for all $j$, and thus
also $I$.

•  The proof that $Q_i$ holds upon termination of the loop is given by

$x_i \ge \min\{y_1, \ldots, y_m\} \wedge I \Rightarrow Q_i$.

•  The measure must be adapted to

$\begin{cases} \min\{y_1, \ldots, y_m\} - x_i, & \text{if } \min\{y_1, \ldots, y_m\} > x_i \\ 0, & \text{otherwise.} \end{cases}$

Search  A  in  the  opposite  direction 

$C_3$ could also have been refined by searching the array in the direction of
decreasing rather than increasing indices. The resulting program $C_6''$ is shown in
Figure 6.10. Note that now $y_i$ might be updated more than once before the
minimal index is found. This is in contrast to $C_6$ and $C_6'$, which update each $y_i$
at most once. Also, $C_6''$ does not allow an early exit out of the loop.

All  refinements  of  this  section  are  summarized  in  Figure  6.11. 
Figure 6.10: Derivation of the third solution to the array search problem

Figure 6.11: Overview of solutions to the array search problem


Chapter  7 

Developing  distributed 
programs 


The  examples  of  the  previous  chapter  were  based  on  the  assumption  of  a  memory 
that  is  shared  between  parallel  processes.  Parallel  processes  share  information 
by simply reading from and writing to this shared memory. This chapter will
demonstrate  the  formal  development  of  programs  in  which  parallel  processes 
communicate  solely  through  message  passing.  The  development  of  the  examples 
in  this  chapter  proceeds  as  follows. 

1.  First, a shared-variable implementation is derived from the initial
specification using the exact same techniques as in the previous chapter.

2.  Then, for each parallel process we determine which part of the state is
maintained and thus logically "owned" by that process. This part of the
state is called local. The remaining part is called non-local. We assume
that every process has direct memory access to its local information.

3.  For  every  piece  of  non-local  information  that  needs  to  flow  into  a  parallel 
process  a  channel  is  introduced  connecting  producer  and  consumer  of  the 
information.  Semantically,  channels  are  local  variables  ranging  over  finite 
queues. 

4.  A  sequence  of  refinement  steps  gradually  ensures  that 

•  every  parallel  process  makes  the  part  of  its  local  information  required 
by  other  processes  available  through  these  channels,  and  that 

•  every  parallel  process  satisfies  its  needs  for  non-local  information  by 
accessing channels rather than shared memory.

5.  The  refinement  stops  when  every  process  obtains  its  non-local  information 
through  channels. 
The above decomposition of the development is a matter of convenience,
not  technical  necessity.  In  other  words,  distributed  programs  could  also  be 
developed  directly  from  the  initial  specification.  However,  the  use  of  a  shared- 
variable  implementation  as  an  intermediate  stepping  stone  allows  us  to  separate 
the  introduction  of  parallelism  and  the  introduction  of  channels  and  distribution. 
Note  that  some  algorithms  may  not  allow  this  kind  of  separation. 

Moreover, in principle, a shared-variable program could also be derived from
a  distributed  program.  However,  the  use  of  distributed  programs  as  intermedi¬ 
ate  stepping  stones  appears  less  useful,  because  of  the  overhead  involved  with 
introducing  channels  and  maintaining  their  contents. 


7.1  Example:  Prefix  sum 

A prefix computation is defined in terms of a binary, associative operator $\otimes$.
Given a sequence of values $(v_1, v_2, \ldots, v_n)$ as input, the prefix computation
produces the output $(y_1, y_2, \ldots, y_n)$ where

$y_1 = v_1$
$y_i = y_{i-1} \otimes v_i = v_1 \otimes v_2 \otimes \cdots \otimes v_i$

for $2 \le i \le n$. Suppose we have a collection of $n$ objects where each object $i$
has fields $x[i]$ and $prev[i]$. We assume that each value is stored in some object
and that the objects form a singly linked list $l$. More precisely, for object $i$, let
$prev[i] = j$ if $j$ is the predecessor of $i$ in $l$ and $prev[i] = nil$ if $i$ is the head of the
list. Furthermore, assume that the assignment of the input values to objects
does not follow the layout of the objects in memory, but rather the layout of
the objects in the list. In other words, we do not necessarily have $x[i] = v_i$, but
rather $x[i] = v_j$ if $i$ is the $j$th object from the beginning of $l$. For illustration,
see Figure 7.1 below.

Prefix  computation  is  a  central  operation  in  the  design  of  parallel  algorithms, 
because  many  problems  on  lists  and  trees  can  be  reduced  to  it.  For  instance, 
determining  the  distance  of  each  element  to  the  end  of  the  list,  also  called  list 
ranking,  can  be  solved  by  choosing  the  value  x[i]  at  each  object  i  to  be  1  and  the 
operator to be addition. Another example of a common parallel algorithm that
can  be  reduced  to  a  prefix  computation  is  the  Euler  tour  technique  to  compute 
the  depth  of  each  node  in  a  binary  tree  [CLR90]. 
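
As a point of reference for the derivations that follow, the specification of the prefix
computation on a list can be written as a short sequential program. This is only a
sketch under stated assumptions: the function name `prefix_on_list` is hypothetical,
objects are numbered from 0 rather than 1, and `None` plays the role of nil. The
example input mirrors the situation of Figure 7.1, with string concatenation standing
in for an associative, non-commutative operator.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def prefix_on_list(x: List[T], prev: List[Optional[int]],
                   op: Callable[[T, T], T]) -> List[T]:
    """Sequential reference: out[i] is the prefix value for object i.

    prev[i] is the predecessor of object i in the list, or None for the head.
    """
    n = len(x)
    succ: List[Optional[int]] = [None] * n
    head: Optional[int] = None
    for i, p in enumerate(prev):
        if p is None:
            head = i
        else:
            succ[p] = i              # invert the prev pointers
    out = list(x)
    i = head
    acc: Optional[T] = None
    while i is not None:             # walk the list from head to tail
        acc = x[i] if acc is None else op(acc, x[i])
        out[i] = acc
        i = succ[i]
    return out


# List order is o3, o1, o2, o4 (0-based: 2, 0, 1, 3), as in Figure 7.1.
values = ["v2", "v3", "v1", "v4"]
links = [2, 0, None, 1]
result = prefix_on_list(values, links, lambda a, b: a + b)
```

Choosing every value to be 1 and the operator to be addition yields the position of
each object counted from the head of the list, which is the list-ranking computation
mentioned above with the list direction reversed.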

7.1.1  Deriving a shared-variable implementation

Let each of the numbers 1 through $n$ stand for an object. The predicate
$P_i$ expresses that if $i$ is the $j$th object in list $l$, then $i$ carries value $v_j$ in $x[i]$ and
$prev[i]$ correctly points to the predecessor of $i$ in $l$. Formally,

$P_i \equiv x[i] = v_j \wedge prev[i] = prev_l(i)$


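[Figure 7.1 shows a list of four objects $o_1, \ldots, o_4$ with $prev[o_1] = o_3$,
$prev[o_2] = o_1$, $prev[o_3] = nil$, $prev[o_4] = o_2$ and $x[o_1], \ldots, x[o_4] = v_2, v_3, v_1, v_4$.
The resulting prefix values are $v_1 \otimes v_2$, $v_1 \otimes v_2 \otimes v_3$, $v_1$, and
$v_1 \otimes v_2 \otimes v_3 \otimes v_4$.]

Figure 7.1: Illustration of the input to the prefix sum algorithm for n = 4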


$Q$ requires every object $i$ to contain its corresponding prefix sum in $x[i]$. The
first four refinement steps are summarized in Figure 7.2. We start with the very
abstract and obviously correct program $C_1$. It allows $x$ and $prev$ to be changed
arbitrarily an arbitrary but finite number of times before terminating in a state
satisfying $Q$.
$C_1 \equiv \{P\};\ \{x, prev\}{:}[tt, tt]^{*};\ \{Q\}$
    where $P \equiv \forall 1 \le i \le n.\ x[i] = v_j \wedge prev[i] = prev_l(i)$
          $Q \equiv \forall 1 \le i \le n.\ x[i] = [1, j]$

$C_2 \equiv \{P\};\ (\{B \wedge I\};\ \{x, prev\}{:}[tt, tt]^{*};\ \{I\})^{*};\ \{Q\}$
    where $B \equiv \exists 1 \le i \le n.\ prev[i] \ne nil$
          $I \equiv \forall 1 \le i \le n.\ I_i$
          $I_i \equiv \otimes(i) = [1, j]$

$C_3 \equiv \{P\};\ (\{B \wedge I\};\ [\,\|_{i=1}^{n} \{x[i], prev[i]\}{:}[tt,\ I_i \wedge \forall i.\ pre\ I_i]\,])^{*};\ \{Q\}$

$C_4 \equiv \{P\};$
    $(\{B \wedge I\};$
     $[\,\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
         $x[i], prev[i] := x[prev[i]] \otimes x[i],\ prev^2[i]\,])^{*};$
    $\{Q\}$

$C_5 \equiv \{P\};$
    while $\exists i.\ prev[i] \ne nil$ do
        $\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
            $x[i], prev[i] := x[prev[i]] \otimes x[i],\ prev^2[i]$

Figure 7.2: Derivation of a shared-variable solution to the prefix sum problem


Refining $C_1$ into $C_2$

Obviously, the computation of $Q$ requires a number of steps. In many of the
previous examples, the postcondition or some intermediate predicate could be
achieved through parallel computation. Each parallel process computes a part
of the solution while respecting the computation of the other processes. There
is no obvious way in which the computation of $Q$ can be parallelized in this
way. Instead, we opt to prepare for the refinement of the finite loop in $C_1$ into a
while loop. We thus need to identify a loop condition $B$ and a loop invariant $I$.
The loop should terminate at least when $Q$ holds. Thus, we let $B \equiv \neg Q$. Also,
let $\otimes(i)$ denote the product of the values in $i$ and all its predecessors, that is,

$\otimes(i) = \begin{cases} \otimes(prev[i]) \otimes x[i], & \text{if } prev[i] \ne nil \\ x[i], & \text{otherwise.} \end{cases}$
Given an object $i$ that is the $j$th object in the list, the invariant $I_i$ claims that
following the prev pointers towards the beginning of the list and "multiplying"
the values stored in each of the encountered objects yields $[1, j]$, the prefix sum
to be stored at $i$. Formally,

$I_i \equiv \otimes(i) = [1, j]$   if $i$ is the $j$th object
$I \equiv \forall 1 \le i \le n.\ I_i$.

We now have the loop condition and the invariant. Assuming the invariant,
the loop condition can be simplified. More concretely, if $I$ holds, then an object
$i$ that is $j$th in the list carries the final value $[1, j]$ if and only if $prev[i] = nil$.
Formally,

$I \Rightarrow (\neg Q \Leftrightarrow B)$

where $B \equiv \exists 1 \le i \le n.\ prev[i] \ne nil$. We prefer to use $B$ over $\neg Q$, because it is
easier to check. Moreover, we need to allow for more than just a single update
of $x$ and $prev$ in the loop body. Formally, we derive


{xjprev}:[ti,tt]* 

=rt  l{x,prev}:[tt,tt]*)* 

Lemma  2.1 

=jt  (skip ;  {x,prev}:\tt,tt\* ;  skip) 

Lemma  2.1 

{{B  Al}  {x,prev]:[U,U]*  {I})* 

Lemma  2.2 

which implies $C_1 \succ C_2$ by congruence (Lemma 2.2).

Refining  C2  into  C3 

The loop body has to be fully developed before the finite loop can be refined into
a while loop. This is because of the termination requirement in rule WHILE-INTRO.
The loop body has to maintain the invariant $I$ while making progress
towards a solution, that is, towards falsifying $B$. While the computation of $Q$
could not be parallelized, the loop body can be. Each object $i$ is assigned to a
parallel process that is responsible for updating $x[i]$ and $prev[i]$ in such a way
that $I_i$ is achieved upon termination and the invariants $I_j$ of the other processes
$j$ are preserved. More concretely,

$\{x, prev\}{:}[tt, tt]^{*};\ \{I\}$

is refined into

$\|_{i=1}^{n} \{x[i], prev[i]\}{:}[tt,\ I_i \wedge \forall i.\ pre\ I_i]$.

Before PAR-INTRO can be applied, the preservation requirement must be added.
TZi  =  {x,prev}:[it,tt]* 

Crt  {x,prev]:[it,^i.pre  !{]*  Lemma  2.2 

We  check  the  premises  of  PAR- INTRO. 
1.  Program $\{x, prev\}{:}[tt,\ \forall i.\ pre\ I_i]$ is atomic and thus robust (Proposition 2.1).
It can also readily be seen that it preserves $\{I\}$.

2.  Process  i  is  refined  as  follows. 

{x,prev}:[ti^  "ii.pre  /,] 

>- 

{x[i],prev[i]}:[iij  U  A  Wi.pre  /,]  (Lemma  2.2) 

UAi}] 

for all $1 \le i \le n$. Note that the fact that process $i$ only modifies $x[i]$ and
$prev[i]$ does not imply that it also preserves $I_j$ for all $j \ne i$.

3.  From the above refinement, it can easily be seen that the assumptions and
guarantees of the processes fit together. Every process preserves $I$ if it is
being preserved by its environment.

4.  The postconditions of each process imply the overall postcondition, that
is, $(\forall 1 \le i \le n.\ I_i) \Rightarrow I$.

Thus, by PAR-INTRO,

n2  =  [7,  {7}] 

{x,prev}:[U,U]*  ;{I} 

y  2rt(^i) 

{x^prev}:[ttj'ii,pre  !{]*  ;  {7} 

y  PAR-INTRO(l,2,3,4),WEAK 

||JLi{x[f],prev[i]}:[^<, 7,-  A'ii.pre  7,] 

[7,  Preds(0)] . 

Wrapping the remaining context around yields

$[P, \{P, I, Q\}]$  $C_2 \succ C_3$  $[Q, Preds(\emptyset)]$.   SEQ($\mathcal{R}_2$), STAR, SEQ


Refining $C_3$ into $C_4$

Each of the parallel processes

$\{x[i], prev[i]\}{:}[tt,\ I_i \wedge \forall i.\ pre\ I_i]$

is refined now. Since $I$ is the loop invariant and $I_i$ is preserved by all other
processes, process $i$ can assume $I_i$ before execution. Thus, skip would be a
correct refinement. However, no progress towards the solution would be made,
and termination could thus never be shown. Instead, we will employ a
technique commonly called pointer jumping [CLR90]. Let $j$ be the predecessor
of $i$. The predecessor of $i$ is set to the predecessor of $j$ while the value of $i$ is
replaced with the product of the value of $j$ and the current value of $i$. More
precisely, each process $i$

$D_i \equiv \{x[i], prev[i]\}{:}[tt,\ I_i \wedge \forall i.\ pre\ I_i]$

is replaced by

$D_i' \equiv$ if $prev[i] \ne nil$ then $x[i], prev[i] := x[prev[i]] \otimes x[i],\ prev^2[i]$

where $prev^2[i]$ is short for

$prev^2[i] = \begin{cases} prev[prev[i]], & \text{if } prev[i] \ne nil \\ nil, & \text{otherwise.} \end{cases}$

Pointer jumping maintains the invariant and, as we will see in the next refinement
from $C_4$ to $C_5$, it also means progress towards a solution. Formally,


R'3,i  = 

Di 


{x[i]^prev[i]}:[Uj  R  A  "ii.pre  U] 

>-  D.j-4; (Lemma  2.2),  Lemma  5.4 

skip  ;  {x[i\,prev{i\]'\ti^  L  A  "ii.pre  R] 

Vskip  ;  {x[i],prev[i]}:[tt^  !{  A  ^i.pre  R] 

>*-  (Lemma  2.2),  Lemma  5.4 

{prev[i]  /  nil}  ;  {x[i],prev[i]}:[tt^  R  A  \li.pre  R] 

\/{prev[i]  ~  nil}  ;  {/*} 

if  prev[i]  ^  nil  then  {x[i],prev[i]}:[tt,  U  A\/i.pre  R] 
else  {li} 

y  ATOM, COND 

if  prev[i]  ^  nil  then  x[i]j prev[i]  :=x[prev[i]]  0  x[i], prev^[i] 

else  skip 

D'i 

where

$\Gamma_i \equiv \{I,\ prev[i] \ne nil,\ prev[i] = nil\}$

$\Delta_i \equiv \{I,\ prev[j] \ne nil,\ prev[j] = nil \mid j \ne i\}$.

Refinement for the loop body is obtained by putting all processes in parallel.
Note that the processes respect each other's requirements, that is, $\Gamma_i \subseteq \Delta_j$ for
all $j \ne i$.

$\mathcal{R}_4 \equiv [I, \bigcup_i \Gamma_i]$  $\|_{i=1}^{n} D_i \succ \|_{i=1}^{n} D_i'$  $[I, \{I\}]$.   PAR-N($\mathcal{R}_{3,i}$)

Note that $\bigcap_i \Delta_i = \{I\}$. Refinement between $C_3$ and $C_4$ follows.

$[P, \{P, I, Q\} \cup Preds(\{prev\})]$
$C_3 \succ C_4$   STAR, SEQ
$[Q, Preds(\emptyset)]$.


Refining  C4  into  C5 

The loop body is now fully refined and the finite loop can now be replaced by
a while loop using WHILE-INTRO. We check the premises of the rule.

1.  The loop body does not need to be refined any further. To satisfy the first
premise, refinement

$[B \wedge I, \bigcup_i \Gamma_i]$  $\|_{i=1}^{n} D_i \succ \|_{i=1}^{n} D_i'$  $[I, \{I\}]$

is obtained from $\mathcal{R}_4$ and strengthening of the precondition.

2.  The negation of the loop condition and the invariant imply the solution,

$(\forall i.\ prev[i] = nil \wedge I) \Rightarrow Q$.

3.  The loop termination proof is a little more difficult than in the previous
examples. As before, we find an arithmetic expression $m$ that serves as
a variant. For each object $i$, the measure $m_i$ records the "distance" of $i$
from the beginning of the list.

$m_i = \begin{cases} m_{prev[i]} + 1, & \text{if } prev[i] \ne nil \\ 0, & \text{otherwise} \end{cases}$

$m = \sum_{i=1}^{n} m_i$.

Note that $m \ge 0$ by definition and $m = 0 \Leftrightarrow \forall i.\ prev[i] = nil$, that is,
$m = 0 \Rightarrow \neg B$. Moreover, if $prev$ does not contain any cycles, then $m$ is
finite. Finally, we need to determine if and under what conditions the loop
body always decreases the measure. The loop condition $B$ expresses that
at least one entry in $prev$ is not nil. Without loss of generality, let that
index be $k$, that is, $prev[k] \ne nil$. Consequently, assuming that the values
of $prev$ and thus $prev[k] \ne nil$ are preserved, $\|_{i=1}^{n} D_i'$ can be simplified by
replacing the conditional in $k$ by its then-branch. Formally,

7^5  =  [B,Preds{{prev})] 

x[k]^ prev[k]  :s:x\prev[k]]  <g)  x[k]^prev^[k]  II  [ll?=i,i>ifcAl 
[it,  Pfieds(0)]. 
Assuming $prev$ does not contain any cycles, the pointer jump must bring
$k$ closer to the beginning of the sublist that it is part of. Its distance to
the head of the sublist decreases and thus $m$ decreases.

[ac{prev)^  {ac{prev)}'\ 

Am 

>-  ATOM 

x[k],prev[k]  :-x\prev[k]]  0  x[k],pr€v^[k] 

[ac{prev),  {ac{prev)}] 

where $ac(prev)$ denotes that $prev$ is free of cycles.

Each $D_i'$ with $i \ne k$ either decreases $m$ or leaves it unchanged. It also does
not introduce cycles. Formally,

[ac{prev),  {ac{prev)}'\ 

[inv  m  ;  Am)  V  inv  m  y  D[ 

\ac(prev)^  {ac{prev)}] 

for  all  i  with  i  ^  k.  Consequently, 

IZe  =  [ac{prev)^{ac{pr€v)}] 

{inv*m  ;  Am  ;  inv*m)'^ 

y 

x[k],prev[k]:zzx[prev[k]]<S)  x[k],prev^[k]  || 

[ac{prev)j  {ac{prev)}] 

can  be  shown  by  induction  over  n.  Thus,  the  loop  body  decreases  m  as 
required. 


[B  a  ac{prev),  Preds{{prev})] 

{inv*m;Am  ;inv*m)'^  y  Lemma 5. 3(7^5, 7^6) 

[ac{prev),  Preds{^)] . 

Informally, the loop body reduces the measure if the loop condition
$B$ is true, $prev$ is acyclic, and $prev$ is not changed by the environment.
Since $prev$ is initialized as a list, it must be acyclic initially. Since it
is preserved, $ac(prev)$ is part of the invariant. In contrast to the other
examples, the termination proof is context-sensitive. Process $k$ will only
decrease the measure if $prev[k] \ne nil$ and $ac(prev)$ are preserved. Note
that this context-sensitivity requires a slight adaptation of rule WHILE-INTRO.
The assumptions needed to prove that the body decreases the
measure need to be added to the assumptions of the overall refinement.
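
The behaviour of the measure can be checked on a small instance. The following
sketch is illustrative only: the names `distance`, `measure`, and `jump` are
hypothetical, objects are numbered from 0, `None` stands for nil, and `prev` is
assumed to be acyclic so that the walk in `distance` terminates.

```python
from typing import List, Optional


def distance(prev: List[Optional[int]], i: int) -> int:
    """m_i: number of prev links between object i and the head of its sublist."""
    d = 0
    while prev[i] is not None:       # terminates because prev is assumed acyclic
        i = prev[i]
        d += 1
    return d


def measure(prev: List[Optional[int]]) -> int:
    """m: the sum of all m_i."""
    return sum(distance(prev, i) for i in range(len(prev)))


def jump(prev: List[Optional[int]], k: int) -> List[Optional[int]]:
    """Return a copy of prev after one pointer jump at object k."""
    new_prev = list(prev)
    if prev[k] is not None:
        new_prev[k] = prev[prev[k]]
    return new_prev


links = [2, 0, None, 1]              # list order 2, 0, 1, 3
before = measure(links)              # 1 + 2 + 0 + 3
after = measure(jump(links, 3))      # object 3 now points to object 0
```

A jump at an object whose predecessor is not nil strictly decreases the sum, while a
jump at the head leaves it unchanged, matching the case distinction made in the proof.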
All obligations of rule WHILE-INTRO are satisfied and we obtain

$[I, \{I, Q\} \cup Preds(\{prev\})]$
$(\{B \wedge I\};\ [\,\|_{i=1}^{n} D_i'\,])^{*};\ \{Q\}$
$\succ$   WHILE-INTRO(1,2,3)
while $B$ do $\|_{i=1}^{n} D_i'$
$[Q, Preds(\emptyset)]$

and

$[P, \{P, I, Q\} \cup Preds(\{prev\})]$
$C_4 \succ C_5$   SEQ
$[Q, Preds(\emptyset)]$.

Note that $C_5$ coincides with the algorithm given in [CLR90]. The use of
a multiple assignment in program $C_5$ prevents the environment from accessing
one of the entries in $x$ and $prev$ in inconsistent intermediate states. It is thus
crucial for correctness. On the other hand, it also restricts parallelism and
is thus undesirable. In Section 7.1.3 we will see how further refinements allow
us to increase parallelism by replacing the multiple assignment by two single
assignments.
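
The pointer-jumping loop of $C_5$ can be simulated sequentially. The sketch below is an
illustration rather than the derived program itself: the name `pointer_jumping_prefix`
is hypothetical, objects are numbered from 0, `None` stands for nil, and the parallel
composition is modelled by one particular interleaving in which the atomic multiple
assignments are executed in index order within each iteration.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def pointer_jumping_prefix(x: List[T], prev: List[Optional[int]],
                           op: Callable[[T, T], T]) -> List[T]:
    """Compute all prefix values by repeated pointer jumping."""
    x = list(x)
    prev = list(prev)
    while any(p is not None for p in prev):      # loop condition B
        for i in range(len(x)):                  # one interleaving of the processes
            p = prev[i]
            if p is not None:
                # Atomic multiple assignment: both right-hand sides are
                # evaluated before either variable is updated.
                x[i], prev[i] = op(x[p], x[i]), prev[p]
    return x


values = ["v2", "v3", "v1", "v4"]
links = [2, 0, None, 1]
result = pointer_jumping_prefix(values, links, lambda a, b: a + b)
```

Because each update of $x[i]$ and $prev[i]$ happens in a single step, the product of the
values along the prev chain of every object is the same before and after the step,
which is the invariant $I$ used in the derivation.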

7.1.2  Deriving a distributed implementation

Using program $C_5$ as a starting point, we would like to derive a distributed
implementation. More precisely, we want object $i$ to access only the variables
$x[i]$ and $prev[i]$ directly while the values of other variables that $i$ depends upon
are communicated to $i$ by message passing. Figure 7.3 summarizes the derivation
of a distributed implementation.

Refining $C_5$ into $C_6$

This refinement step introduces local channels $c[1]$ through $c[n]$. After each
iteration, object $i$ will use channel $c[i]$ to communicate the updated values of
$prev[i]$ and $x[i]$ to its successor, that is, to the object $j$
with $prev[j] = i$. The formal proof of this refinement step is based on

$[I \wedge inj(prev) \wedge P',\ \Gamma]$
$\|_{i=1}^{n} D_i' \succ_{\{c\}} \|_{i=1}^{n} D_i''$                                   (7.1)
$[I \wedge inj(prev) \wedge Q',\ \{I\}]$
$C_5 \equiv \{P\};$
    while $\exists i.\ prev[i] \ne nil$ do
        $\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
            $x[i], prev[i] := x[prev[i]] \otimes x[i],\ prev^2[i]$

$C_6 \equiv \{P\};$
    new $c[1] = \langle (prev[1], x[1]) \rangle, \ldots, c[n] = \langle (prev[n], x[n]) \rangle$ in
    while $\exists i.\ prev[i] \ne nil$ do
        $[\,\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
            new $p = x = 0$ in
            $c[prev[i]]?(p, x);$
            $x[i], prev[i] := x \otimes x[i],\ p\,];$
        $[\,\|_{i=1}^{n}\ c[i]!(prev[i], x[i])\,]$

$C_7 \equiv \{P\};$
    new $c[1] = \langle (prev[1], x[1]) \rangle, \ldots,$
        $c[n] = \langle (prev[n], x[n]) \rangle,\ d = \langle tt \rangle$ in
    while $d \ne \langle\rangle$ do
        $empty(d);$
        $[\,\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
            new $p = x = 0$ in
            $c[prev[i]]?(p, x);$
            $x[i], prev[i] := x \otimes x[i],\ p\,];$
        $[\,\|_{i=1}^{n}\ c[i]!(prev[i], x[i])\ \|$ if $prev[i] \ne nil$ then $d!tt\,]$

    where $empty(d) \equiv$ new $x = 0$ in while $d \ne \langle\rangle$ do $d?x$

Figure 7.3: Derivation of a distributed solution to the prefix sum problem


where

$D_i' \equiv$ if $prev[i] \ne nil$ then
          $x[i], prev[i] := x[prev[i]] \otimes x[i],\ prev^2[i]$

$D_i'' \equiv$ if $prev[i] \ne nil$ then
          new $p = x = 0$ in
          $c[prev[i]]?(p, x);$
          $x[i], prev[i] := x \otimes x[i],\ p$

and

$inj(prev) \equiv$ $prev$ is injective, that is,
    $\forall 1 \le i, j \le n.\ i \ne j \Rightarrow (prev[i] \ne prev[j] \vee prev[i] = prev[j] = nil)$
$P' \equiv \forall 1 \le i \le n.\ (\exists j.\ prev[j] = i) \Rightarrow c[i] = \langle (prev[i], x[i]) \rangle$
$Q' \equiv \forall 1 \le i \le n.\ (\exists j.\ prev[j] = i) \Rightarrow c[i] = \langle\rangle$
$\Gamma \equiv Preds(\{prev[i], x[i], c[i] \mid 1 \le i \le n\})$.


$P'$ expresses that if $i$ is the predecessor of some other process, then the channel
$c[i]$ will contain $prev[i]$ and $x[i]$. $Q'$ expresses that if $i$ is the predecessor of some
other process, then the channel $c[i]$ will be empty. The proof of this refinement
is elegant but lengthy and thus postponed to Section A.3.1. Moreover, we can
show


[/  A  inj{prev)  A  Q*,  {/,  2nj(pre?;),  Q',  P'}] 
skip 

skip”  (Lemma  2.1),  Lemma  5.4 

y^c]  ATOM.  PAR-N 

||JLic[i]!(pret;[i],a:[i]) 

[/  A  inj{prev)  A  P',  Freds(0)]. 

Thus, the loop invariant $I$ used in the derivation of the shared-variable
implementation, together with the injectivity of $prev$ and $Q'$, forms the invariant of
the modified while loop in $C_5$.

Preds{{x,prev})]  C4  y  C5  [Q,  Preds{9)] 

follows  with  WHILE,  NEW-INTRO,  and  SEQ. 

Refining $C_6$ into $C_7$

The loop condition in $C_6$ is rather complex and also requires access to the
entire $prev$ array. This refinement step will simplify the implementation of the
loop condition. Another channel $d$ is introduced such that we have $d \ne \langle\rangle$ if
and only if $\exists i.\ prev[i] \ne nil$ at the beginning of each iteration. More formally,
we show


$[d = \langle\rangle,\ Preds(\{prev, d\})]$
$\|_{i=1}^{n}\ c[i]!(prev[i], x[i])$
$\succ_{\{d\}}$
$\|_{i=1}^{n}\ [c[i]!(prev[i], x[i])\ \|$ if $prev[i] \ne nil$ then $d!tt]$
$[Q,\ Preds(\emptyset)]$

where $Q \equiv d = \langle\rangle \Leftrightarrow \forall i.\ prev[i] = nil$. This refinement is obtained from

$[R, \{R, S_i\}]$
$c[i]!(prev[i], x[i])$
$\succ_{\{d\}}$
$[c[i]!(prev[i], x[i])\ \|$ if $prev[i] \ne nil$ then $d!tt]$
$[R \wedge S_i,\ \{R, S_j \mid j \ne i\}]$

where

$R \equiv (\forall 1 \le i \le n.\ prev[i] = nil) \Rightarrow d = \langle\rangle$

$S_i \equiv prev[i] \ne nil \Rightarrow d \ne \langle\rangle$

using PAR-N and weakening together with the fact that $d = \langle\rangle \Rightarrow R$ and
$R \wedge \forall i.\ S_i \Rightarrow Q$. Predicate $Q$ can be shown to be part of the loop invariant of the
while loop in $C_7$. Assuming the invariant, the old loop condition $\exists i.\ prev[i] \ne nil$
of $C_6$ and the new loop condition $d \ne \langle\rangle$ of $C_7$ are equivalent, that is,

$Q \Rightarrow (\exists i.\ prev[i] \ne nil \Leftrightarrow d \ne \langle\rangle)$.

Thus, one can be replaced by the other using WHILE.
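
The message flow of the distributed program can be mimicked with ordinary queues.
The following is a sequential sketch under several assumptions: the name
`distributed_prefix` is hypothetical, `collections.deque` objects stand in for the
channels $c[i]$ and $d$, objects are numbered from 0, `None` stands for nil, and the two
parallel compositions in the loop body are executed as two consecutive index-order
passes (a receive phase followed by a send phase).

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def distributed_prefix(x: List[T], prev: List[Optional[int]],
                       op: Callable[[T, T], T]) -> List[T]:
    """Prefix computation in which object i only touches x[i], prev[i] and channels."""
    n = len(x)
    x = list(x)
    prev = list(prev)
    # c[i] initially holds the single message (prev[i], x[i]).
    c: List[Deque[Tuple[Optional[int], T]]] = [deque([(prev[i], x[i])]) for i in range(n)]
    d: Deque[bool] = deque([True])           # d initially holds one message
    while d:                                 # loop condition: d is not empty
        d.clear()                            # empty(d)
        for i in range(n):                   # receive phase
            if prev[i] is not None:
                p, xv = c[prev[i]].popleft() # read from the predecessor's channel
                x[i], prev[i] = op(xv, x[i]), p
        for i in range(n):                   # send phase
            c[i].append((prev[i], x[i]))
            if prev[i] is not None:
                d.append(True)               # signal that another iteration is needed
    return x


values = ["v2", "v3", "v1", "v4"]
links = [2, 0, None, 1]
result = distributed_prefix(values, links, lambda a, b: a + b)
```

Channels of objects that have no successor are never read in this sketch and simply
keep their messages, which is harmless because the predicates $P'$ and $Q'$ only constrain
channels of objects that are the predecessor of some other object.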

7.1.3  Increasing parallelism

We would like to derive a distributed implementation that also exhibits more
fine-grained parallelism.

Inspection of program $C_7$ reveals that the distributed implementation obviates
the need to update the variables $x[i]$ and $prev[i]$ simultaneously, because
no other process mentions $x[i]$ and $prev[i]$ anymore. Consequently, no other
process can access $x[i]$ or $prev[i]$ while they are being assigned to. Let $C_8$ be

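$C_8 \equiv$ new $c[1] = \langle (prev[1], x[1]) \rangle, \ldots,$
          $c[n] = \langle (prev[n], x[n]) \rangle,\ d = \langle tt \rangle$ in
      while $d \ne \langle\rangle$ do
          $empty(d);$
          $[\,\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
              new $p = x = 0$ in
              $c[prev[i]]?(p, x);$
              $[x[i] := x \otimes x[i]\ \|\ prev[i] := p]\,];$
          $[\,\|_{i=1}^{n}\ c[i]!(prev[i], x[i])\ \|$ if $prev[i] \ne nil$ then $d!tt\,]$.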

However, while $C_8$ constitutes a correct implementation in terms of its
input-output behaviour, it unfortunately is not a refinement of $C_7$. Intuitively, this
is because $C_8$ allows $x[i]$ and $prev[i]$ to be updated in succession, whereas $C_7$
always updates these variables simultaneously. However, if changes to one of
the arrays are made invisible, that is, if one of them is declared local, then $C_7$
and $C_8$ exhibit equivalent behaviour.

Let $dec$ stand for the following list of declarations

$dec \equiv prev[1] = p_1, \ldots, prev[n] = p_n$

where $p_i$ is the predecessor of object $i$ in list $l$. Using Lemma 2.1 we can thus
deduce

new $dec$ in $C_7$ $=$ new $dec$ in $C_8$.

With the previous refinements this implies

new $dec$ in $C_i$ $\succ$ new $dec$ in $C_8$

for all $1 \le i \le 7$.

Note that the amount of parallelism in $C_5$ could also have been increased
within the shared-variable paradigm and without introducing channels and
message passing. Program $C_6'$ shown below represents an alternative refinement
of $C_5$ that achieves more parallelism by introducing local variables $x'[i]$ and
$prev'[i]$. These variables serve as local copies of $x[i]$ and $prev[i]$ respectively.

$C_6' \equiv$ new $prev'[1] = prev[1],\ x'[1] = x[1], \ldots,$
           $prev'[n] = prev[n],\ x'[n] = x[n]$ in
       while $\exists i.\ prev[i] \ne nil$ do
           $[\,\|_{i=1}^{n}$ if $prev[i] \ne nil$ then
               $[x'[i] := x[prev[i]] \otimes x[i]\ \|\ prev'[i] := prev^2[i]]\,];$
           $[\,\|_{i=1}^{n}\ [prev[i] := prev'[i]\ \|\ x[i] := x'[i]]\,]$

We  conclude  this  example  with  an  overview  of  all  refinements  performed  in 
this  section  in  Figure  7.4. 
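
The local-copy variant can be pictured as a two-phase update in which every process
first computes into its private copies and only afterwards publishes them. The
sketch below is an illustration under assumptions: the name `two_phase_prefix` and
the arrays `x_loc` and `prev_loc` are hypothetical, objects are numbered from 0,
`None` stands for nil, and each parallel phase is executed as an index-order pass.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def two_phase_prefix(x: List[T], prev: List[Optional[int]],
                     op: Callable[[T, T], T]) -> List[T]:
    """Pointer jumping with local copies instead of multiple assignments."""
    n = len(x)
    x = list(x)
    prev = list(prev)
    x_loc = list(x)            # local copies, one pair per object
    prev_loc = list(prev)
    while any(p is not None for p in prev):
        # Phase 1: read shared x and prev, write only the local copies.
        for i in range(n):
            if prev[i] is not None:
                x_loc[i] = op(x[prev[i]], x[i])
                prev_loc[i] = prev[prev[i]]
        # Phase 2: publish the local copies; the two single assignments
        # per object are independent of each other.
        for i in range(n):
            prev[i] = prev_loc[i]
            x[i] = x_loc[i]
    return x


values = ["v2", "v3", "v1", "v4"]
links = [2, 0, None, 1]
result = two_phase_prefix(values, links, lambda a, b: a + b)
```

Since no shared variable is written during the first phase and none is read by another
process during the second, the order of the two single assignments within a phase does
not matter.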
Figure 7.4: Overview of solutions to the prefix sum problem


7.2  Example: All-pair shortest-paths

Given an unweighted graph $G = (V, E)$, the goal is to solve the all-pair
shortest-paths problem, that is, to compute $dist(v_i, v_j)$, the length of the shortest path
between any two vertices $v_i, v_j \in V$. The length of a path is given by the
number of vertices it contains minus 1. $G$ may be directed or undirected. Let
$n$ be the number of vertices in $G$, that is, $n = |V|$. The shortest distances are
to be stored in a two-dimensional array $D$. The initial program $C_1$ allows an
arbitrary but finite number of updates to the array $D$ before the final state
satisfying $Q \equiv \forall v_i, v_j \in V.\ D[v_i, v_j] = dist(v_i, v_j)$ is established. That is,

$C_1 \equiv D{:}[tt, tt]^{*};\ \{Q\}$.

We will begin by deriving a shared-variable implementation for this problem.
Then, we will attempt to further refine this solution into a distributed
implementation. Our first derivation is summarized in Figure 7.5.

Refining $C_1$ into $C_4$

The first refinement $C_2$ suggests considering sequentially the vertices reachable
from some vertex $v$ in the order of increasing distance. More precisely, first we
consider the vertices that can be reached from $v$ via paths of length 1, then via
paths of length 2, and so on, until we have considered paths of length $n - 1$.

Refinement $C_3$ introduces the concept of a fringe. The fringe of a vertex $v$
with distance $k$, $fringe(k, v)$ for short, is defined to be the set of vertices that
are reachable from $v$ through shortest paths of length $k$. Formally,

$fringe(k, v) = \{v' \mid dist(v, v') = k\}$.

If $X = fringe(k, v)$ we say that $X$ is the $k$-fringe of $v$. A local variable $F[k, v]$
that holds the $k$-fringe of $v$ is introduced. In each iteration $k$, the computation
of the current fringe $F[k, v]$ is obtained by considering the immediate neighbours
of all vertices in the $(k-1)$-fringe of $v$. Formally, we have the property

$fringe(k, v) = \bigcup_{v' \in fringe(k-1, v)} \{v'' \mid (v', v'') \in E \wedge D[v, v''] = nil\}$.   (7.2)

Note that while $C_2$ already breaks the problem down into a sequence of $n - 1$
subproblems which are then solved sequentially, it is not the case that $C_2$ obtains
the solution to the $k$th problem in terms of the solutions to the $k - 1$ subproblems
already solved. Program $C_3$, however, achieves that, and thus qualifies as an
instance of dynamic programming.
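
Property (7.2) translates directly into a sequential program that builds each fringe
from the previous one. The sketch below is an illustration under assumptions: the
name `apsp_by_fringes` is hypothetical, edges are taken to be directed pairs, `None`
stands for nil, and the initialization $D[v, v] = 0$ with $F[0, v] = \{v\}$ is assumed
here because the corresponding figure is not reproduced in the text.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Set, Tuple

Vertex = Hashable


def apsp_by_fringes(vertices: List[Vertex],
                    edges: Iterable[Tuple[Vertex, Vertex]]
                    ) -> Dict[Tuple[Vertex, Vertex], Optional[int]]:
    """All-pair shortest path lengths; unreachable pairs keep the value None."""
    succ: Dict[Vertex, Set[Vertex]] = {v: set() for v in vertices}
    for u, w in edges:
        succ[u].add(w)
    n = len(vertices)
    D: Dict[Tuple[Vertex, Vertex], Optional[int]] = {
        (v, w): None for v in vertices for w in vertices}
    F: Dict[Tuple[int, Vertex], Set[Vertex]] = {}
    for v in vertices:                       # assumed initialization
        D[(v, v)] = 0
        F[(0, v)] = {v}
    for k in range(1, n):                    # paths of length 1, 2, ..., n - 1
        for v in vertices:
            # Neighbours of the (k-1)-fringe of v that are not yet reached from v.
            fringe = {v2 for v1 in F[(k - 1, v)] for v2 in succ[v1]
                      if D[(v, v2)] is None}
            for v2 in fringe:
                D[(v, v2)] = k
            F[(k, v)] = fringe
    return D


D_example = apsp_by_fringes(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

Each iteration reuses the fringes computed in the previous iteration, which is the
dynamic-programming aspect noted above.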

Figure 7.5: Derivation of the first shared-variable solution to the all-pair
shortest-paths problem

Refinement $C_4$ is obtained from $C_3$ by breaking down the computation of
the $k$-fringe of $v$. More precisely, we use the refinement
[Vv  €  V.F[k  —  1,  v]  =  fringe(k  —  1,  v),  F] 

F[k,v]:=\jf^y^{v''  e  fringe{k  -  |  D[v,v'']  =  nil} 

>- 

new  <1  =  0  in 

new  t2  9  in 

F[fc-i,t-]  j-  if  D[v,  v"]  -  nil  then  U  {v")]; 

<1  :=ti  U  t2  ’ 

end 

F[k,v]:=ti 

end; 

[F[k,  i;]  =  fringe{k,  v),  A] 

where

$\Gamma \equiv Preds(\{D[v, x],\ F[k-1, x],\ F[k, x] \mid x \in V\})$

$\Delta \equiv \{F[k, v] = fringe(k, v)\}\ \cup$
    $Preds(\{D[x, x'],\ F[k-1, x] \mid x, x' \in V\} \cup \{F[k, x] \mid x \in V \wedge x \ne v\})$.

Note that the correctness of this refinement crucially depends on the atomicity
of the assignments to $t_1$ and $t_2$.

7.2.1  Deriving  a  distributed  implementation 

Suppose that

•  every vertex only knows its immediate neighbours, that is, $E$ is not globally
known or accessible,

•  every vertex only knows its own fringe, that is, $F[k]$ is not globally
available,

•  all other information is considered non-local.

We want to derive a solution to the all-pair shortest-paths problem that is
distributed in the sense that all non-local information that a vertex $v$ may need
is communicated to $v$ explicitly through message passing.

We will first attempt to refine $C_4$ into a distributed implementation.
Program $C_4$ requires every vertex $v$ to know the immediate neighbours $v''$ of every
vertex $v'$ in the fringe of $v$. Unless $v = v'$, this is non-local information and thus
needs to be communicated via channels and message passing. Consequently, in
each iteration, every vertex $v'$ in the graph would have to be prepared to send
a list $l$ of its immediate neighbours to $v$. A distributed implementation based
on $C_4$ would thus have the following advantages and disadvantages.


Advantages:

• The size of l is bounded by the maximal number of neighbours of a vertex.

• In every iteration k, vertex v' would always send the same message l.

Disadvantages:

• Since vertex v' does not know the vertices v such that v' appears in the
fringe of v, v' has to be conservative and always send l to every vertex
in  the  graph.  Thus,  in  every  iteration,  messages  need  to  be  sent.  The 
total  number  of  messages  sent  (including  the  initialization)  would  thus  be 
n^.  This  number  is  independent  of  the  structure  of  the  input  graph.  For 
instance,  both  a  strongly  connected  graph  and  a  graph  with  no  edges  at 
all  would  give  rise  to  send  actions. 

•  Since  not  every  vertex  appears  in  the  fringe  of  another  vertex  v  in  every 
iteration,  some  of  these  messages  are  redundant  and  will  never  be  received. 
In  case  of  the  graph  with  no  edges,  none  of  the  messages  will  ever  be 
received.  All  of  them  are  redundant. 

Given  this  analysis,  we  conclude  that  C4  is  not  a  good  base  for  a  distributed 
implementation.  Instead,  we  revisit  the  second  refinement  (from  C2  to  C3)  and 
suggest  an  alternative  way  of  refining  C2  that,  hopefully,  leads  to  a  more  efficient 
and less redundant distributed implementation. This alternative refinement is
summarized in Figure 7.6.

Refining C2 into C4'

A different representation of Equation (7.2) gives rise to an alternative way to
compute the k-fringe of v. The k-fringe of v is now obtained by considering the
(k − 1)-fringes of the immediate neighbours of v. Formally,

$\mathit{fringe}(k, v) = \bigcup_{(v, v') \in E} \{v'' \in \mathit{fringe}(k-1, v') \mid D[v, v''] = \mathit{nil}\}.$

This property forms the basis of the refinement of C2 into C4'. Like C4,
refinement C4' breaks down the computation of the fringe. The formal justification
is similar and thus omitted.
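The neighbour-based characterization of the k-fringe can be illustrated with a small sequential sketch. It is not the parallel program of the refinement; it only assumes a directed, unweighted graph, that fringe(0, v) = {v}, and it uses the hypothetical name `all_pairs_by_fringes`, with `None` standing for nil.

```python
def all_pairs_by_fringes(vertices, edges):
    """Unweighted all-pairs shortest paths via k-fringes (illustrative sketch).

    D[v][w] is the distance from v to w, or None (the text's nil) if w has
    not been reached from v yet.
    """
    succ = {v: set() for v in vertices}
    for v, w in edges:
        succ[v].add(w)

    D = {v: {w: (0 if v == w else None) for w in vertices} for v in vertices}
    prev = {v: {v} for v in vertices}      # assumed: fringe(0, v) = {v}
    for k in range(1, len(vertices)):
        cur = {}
        for v in vertices:
            # k-fringe of v, built from the (k-1)-fringes of v's neighbours
            cur[v] = {w for v1 in succ[v] for w in prev[v1] if D[v][w] is None}
        for v in vertices:
            for w in cur[v]:
                D[v][w] = k
        prev = cur
    return D


V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "a")]
D = all_pairs_by_fringes(V, E)
```

Note that row v of D is only read and written when the fringe of v is computed, which is what makes the per-vertex computations independent of one another.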

Refining C4' into C5

From C4' we now derive a distributed implementation, which is given in
Figure 7.7. Let C_v be one of the parallel processes of the top-most parallel
composition in C4'. C_v needs to access the fringes F[k − 1, v'] of all vertices
v' it is adjacent to. This information is non-local to C_v and thus needs to be
explicitly communicated. To this end, we introduce a two-dimensional array
of local channels. Each channel c[v', v] will be used by vertex v' to make its
current fringe F[k − 1, v'] available to v. More precisely, the channels are subject
to the following invariant. Consider the kth iteration. If (v, v') ∈ E then
c[v', v] contains fringe(k − 1, v'). A detailed proof of the refinement step is
straightforward and omitted.
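The channel invariant can be mimicked by a round-based simulation in which a dictionary entry plays the role of channel c[v', v]. This is a minimal sketch under several assumptions: rounds are synchronous, the edge list contains no duplicates, each channel initially holds {v'}, and the name `distributed_fringes` and the message counter are illustrative only.

```python
from collections import defaultdict


def distributed_fringes(vertices, edges):
    """Simulate the channel-based computation; returns (D, messages_sent)."""
    succ = defaultdict(set)
    for v, w in edges:
        succ[v].add(w)
    n = len(vertices)
    D = {v: {w: (0 if v == w else None) for w in vertices} for v in vertices}

    # chan[(w, v)] models c[w, v]: the fringe that w makes available to v.
    chan = {(w, v): {w} for v, w in edges}   # assumed initial content: {w}
    sent = len(chan)
    for k in range(1, n):
        fringe = {}
        for v in vertices:
            received = [chan[(w, v)] for w in succ[v]]
            fringe[v] = {x for msg in received for x in msg if D[v][x] is None}
            for x in fringe[v]:
                D[v][x] = k
        if k < n - 1:                        # no send needed after the last round
            for v, w in edges:
                chan[(w, v)] = set(fringe[w])
                sent += 1
    return D, sent


D, sent = distributed_fringes(["a", "b", "c", "d"],
                              [("a", "b"), ("b", "c"), ("c", "a")])
_, sent_no_edges = distributed_fringes(["a", "b", "c"], [])
```

In this sketch the number of messages is proportional to the number of edges, so a graph without edges causes no communication at all.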
C2 = ∥_v ∥_v' if v = v' then D[v, v'] := 0 else D[v, v'] := nil;
     for k = 1 to n − 1 do
         [∥_v ∥_v'' if dist(v, v'') = k then D[v, v''] := k]
     od


C3  =  new  F[0,.n-- .?;„]  =  0  in 

\\y  lljf/  if  t;  =  i;'  then  D[vj  r']  :=0  else  jD[v,  v']  :=:m7; 
for  Ar  =  1  to  n  —  1  do 

V  F[k,  v]  :=  G  F[k  -  1,  V]  |  D[v,  v"]  =  nil}-, 

od 
end 

C!^  =  new  F[0..n—  l,vi..Vn]  =  0  in 

llw  Hi'  if  ^  =  'v'  then  D[v,  u']  :=0  else  D{v,  v']  :=nt7; 
for  A:  =  1  to  n  —  1  do 
=  0  in 

new  /  =  0  in 
/:=F[fc 
new  <2  =  0  in 
[||{„if  D[v,  v"]  =  nil  then 
t2:=<2U  {«"}]; 

<i:=ti  U  <2 
end 
end 
]:=ti 

D[v^v'*]  :=k 
od 
end 

Figure 7.6: Derivation of the second shared-variable solution to the all-pair,
shortest-paths problem

We compare C5, the refinement based on C4', with the refinement based on
C4 that we rejected above. C5 has the following advantages and disadvantages.

Advantages: 

•  In  each  iteration,  every  vertex  v'  knows  exactly  which  other  vertex  to 
communicate  with.  More  precisely,  v'  needs  to  send  its  current  fringe  to 
all  its  immediate  neighbours.  The  number  of  messages  sent  thus  depends 
on the structure of the graph. If the graph has no edges, no messages are
sent. If, however, the graph is strongly connected, messages are sent.
Roughly  speaking,  the  fewer  edges  the  graph  has,  the  fewer  messages  are 
sent. 

•  There  are  no  redundant  messages.  Every  message  that  was  sent  will  also 
be  received  in  the  next  iteration. 

• In contrast to C4, the (k − 1)-fringe of v is not directly needed to compute
the k-fringe of v. In every iteration, F[k, v] is computed based solely on
the input from the neighbours of v. In other words, F[k − 1, v] does not
need to be kept across iterations.

Disadvantages: 

• The size of l is bounded only by n − 1. No better bound can be given.

• In general, in every iteration k, vertex v' would send a different message l.

We conclude that C5 features better best-case behaviour and less redundancy
than a distributed implementation based on C4.

Refining C5 into C6

While the computation of fringe(k, v) in C4' requires direct access to variables
F[k − 1, v'] for all v' such that (v, v') ∈ E, the corresponding computation
in C5 does not. Consequently, the space requirements of C5 can be reduced
by removing the array F, initializing the channel c[v', v] directly with {v'},
wrapping the declaration of a new local variable F_{k,v} around C_v, and replacing
F[k, v] by F_{k,v} in C_v.

We  conclude  this  example  with  an  overview  of  all  refinements  performed  in 
this  section  in  Figure  7.8. 
Figure 7.7: Derivation of a distributed solution to the all-pair, shortest-paths
problem

Figure 7.8: Overview of solutions to the all-pair, shortest-paths problem

Chapter 8

N-process mutual exclusion algorithms


In this section, we will apply our refinement framework to n-process mutual
exclusion algorithms. This example differs from the ones in the previous sections
in two respects. First, due to the complex nature of the problem that these
algorithms attempt to solve, the correct behaviour cannot be specified as
succinctly as for other examples. However, using our specification language we can
nonetheless identify an abstract, high-level representation of the algorithm. We
will first verify this high-level version and then successively refine it. The second
difference is that the rules of our refinement calculus turn out to be insufficient.
In particular, a more specialized rule allowing the introduction of parallelism is
needed. The necessary new refinement rules will be introduced along the way.
As in the previous examples, the derivation of alternative, sometimes more
efficient versions will play an important role.


8.1  Introduction 

Suppose a resource is to be shared between a number of processes. For
consistency reasons, at most one process can access the resource at a time. Examples
of such resources are printers or databases. Mutual exclusion algorithms
solve this problem by granting access to the resource in a mutually
exclusive fashion. Consider, for instance, the two-process tie-breaker algorithm
TIE(2) for mutual exclusion [Pet81].

new in[1..2] = 0, last = 0 in
    while tt do
        in[1] := 1; last := 1;
        await in[2] < in[1] ∨ last ≠ 1;
        cr1;
        in[1] := 0;
        nc1
    od
  ∥
    while tt do
        in[2] := 1; last := 2;
        await in[1] < in[2] ∨ last ≠ 2;
        cr2;
        in[2] := 0;
        nc2
    od
end

Neither the critical sections cr1 and cr2 nor the non-critical sections nc1 and nc2
change the values of in or last. Moreover, cr1 and cr2 are assumed to always
terminate. To prove that two processes cannot be in their critical regions at the
same time, one can attempt to find two predicates P1 and P2 such that

• P1 is always true whenever the left process is executing cr1,

• P2 is always true whenever the right process is executing cr2, and

• P1 ∧ P2 is unsatisfiable.

Unfortunately, for TIE(2) this turns out to be impossible. Consider, for
instance, the obvious candidates for P1 and P2

$P_1 \equiv in[1] = 1 \wedge (in[2] = 0 \vee last = 2)$

$P_2 \equiv in[2] = 1 \wedge (in[1] = 0 \vee last = 1).$


While P1 ∧ P2 is indeed unsatisfiable, P1 does not hold when C1 is in its critical
region and C2 has just expressed interest by executing in[2] := 1 but has not yet
set last to 2. Similarly for P2 and C2. The problem is the intermediate state
between the two assignments. A popular solution is to augment the specification
language such that we can express that control is between two statements and
thus in the problematic intermediate state. This can be achieved with the help
of either location predicates [MP95] or auxiliary variables [OG76a, AO91]. If,
for instance, two boolean auxiliary variables mid[1] and mid[2] are used and
TIE(2) is modified to TIE'(2),


new in[1..2] = 0, last = 0, mid[1] = ff, mid[2] = ff in
    while tt do
        in[1], mid[1] := 1, tt;
        last, mid[1] := 1, ff;
        await in[2] < in[1] ∨ last ≠ 1;
        cr1;
        in[1] := 0;
        nc1
    od
  ∥
    while tt do
        in[2], mid[2] := 1, tt;
        last, mid[2] := 2, ff;
        await in[1] < in[2] ∨ last ≠ 2;
        cr2;
        in[2] := 0;
        nc2
    od
end
then P1 and P2 can be chosen to be

$P_1 \equiv in[1] = 1 \wedge \neg mid[1] \wedge (in[2] = 0 \vee mid[2] \vee last = 2)$ and

$P_2 \equiv in[2] = 1 \wedge \neg mid[2] \wedge (in[1] = 0 \vee mid[1] \vee last = 1).$
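For a program as small as TIE(2), both observations can be checked by exhaustively exploring the reachable states. The sketch below is only an illustration and not part of the refinement calculus: it assumes every assignment and every await test is one atomic step, it skips the non-critical regions, and the names `successors`, `reachable` and `naive_p1` are hypothetical. Process index 0 stands for the left process and index 1 for the right one.

```python
from collections import deque


def successors(state):
    """Yield the states reachable by one atomic step of either process."""
    in_, last, pc = state
    for i in (0, 1):
        other = 1 - i
        new_in, new_last, new_pc = list(in_), last, list(pc)
        if pc[i] == 0:                 # in[i] := 1
            new_in[i] = 1
            new_pc[i] = 1
        elif pc[i] == 1:               # last := i
            new_last = i + 1
            new_pc[i] = 2
        elif pc[i] == 2:               # await in[other] < in[i] or last != i
            if not (in_[other] < in_[i] or last != i + 1):
                continue               # blocked, no step possible
            new_pc[i] = 3              # enter the critical region
        else:                          # leave the critical region, in[i] := 0
            new_in[i] = 0
            new_pc[i] = 0
        yield (tuple(new_in), new_last, tuple(new_pc))


def reachable(initial):
    seen, queue = {initial}, deque([initial])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def naive_p1(state):
    """The candidate P1 without auxiliary variables."""
    in_, last, _ = state
    return in_[0] == 1 and (in_[1] == 0 or last == 2)


states = reachable(((0, 0), 0, (0, 0)))
both_critical = [s for s in states if s[2] == (3, 3)]
p1_counterexamples = [s for s in states if s[2][0] == 3 and not naive_p1(s)]
```

Under these assumptions no reachable state has both processes in their critical regions, while the candidate P1 fails exactly in the intermediate state described above, where the right process has set in[2] but not yet last.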

However, this technique poses some problems. The introduction of auxiliary
variables requires a deep understanding of the algorithm and the property to
be proved. A mechanization of this process seems out of reach. Moreover, the
method does not scale very well. The number of necessary auxiliary variables
increases with the number of parallel components. The predicates quickly
become unwieldy. More importantly, the auxiliary variables do not allow us to
prove another important property: every process that is interested in entering
its critical region must eventually be allowed to do so. Note that this property
is stronger than deadlock-freedom. Unfortunately, the introduction of auxiliary
variables does not help with the verification of this property.

In this section, we show how our refinement calculus can be used to verify
the n-process tie-breaker algorithm without the use of auxiliary variables. We
first prove correctness of an abstract, high-level version which is then refined
successively into the desired implementation. Additionally, we will derive
several different and sometimes more efficient implementations and thus expose
the various “degrees of implementation-freedom” that the n-process tie-breaker
solution offers.


8.2 N-process mutual exclusion algorithms

We  assume  that  an  n-process  mutual  exclusion  algorithm  MX  has  the  following 
general  form 


MX(cr, nc) = new x_1 = v_1, ..., x_m = v_m in ∥_{i=1}^{n} C_i

where

C_i = while tt do
          entry_i;    (* entry protocol *)
          cr_i;       (* critical region *)
          exit_i;     (* exit protocol *)
          nc_i        (* non-critical region *)
      od

and cr and nc stand for cr_1, ..., cr_n and nc_1, ..., nc_n respectively. Additionally,
we impose the following restrictions. For all 1 ≤ i ≤ n,

• exit_i and cr_i are always terminating,

• a subset of the local variables is reserved entirely for the sake of
synchronization. Thus, the values of these variables are only changed in the entry
and exit protocols and left unchanged by cr_i and nc_i.

Note that the non-critical region nc_i may not terminate. Sometimes we will
abbreviate MX(cr, nc) by MX if the particular shape of the critical and
non-critical regions is either understood from the context or irrelevant.
8.3 Correctness criteria for mutual exclusion algorithms

The  following  definition  formally  expresses  what  it  means  for  a  mutual  exclusion 
algorithm  to  be  correct. 

Definition  8.1  (Correctness  criteria) 

1. Given a mutual exclusion algorithm MX, a subprogram C of MX is called
non-interfering with respect to MX if C does not change any of the
variables used in either the entry or exit protocols of MX.

2. A critical region cr_i is called well-formed iff it is always terminating and
non-interfering with respect to MX.

3. A non-critical region nc_i is called well-formed iff it is non-interfering with
respect to MX.

4. A critical region cr_i is called indicative if it is of the form

$p_i^{cr} := tt;\ cr_i;\ p_i^{cr} := \mathit{ff}$

where $p_i^{cr}$ is a fresh boolean variable that is not used anywhere else. The
idea is that $p_i^{cr}$ is true along an execution of MX(cr, nc) if and only if
process i is currently in its critical region. We assume that without loss of
generality every non-indicative critical region can be made to be indicative
by adding the indicator assignments.

5. A non-critical region nc_i is called indicative if it is of the form

$p_i^{nc} := tt;\ nc_i;\ p_i^{nc} := \mathit{ff}$

where $p_i^{nc}$ is a fresh boolean variable that is not used anywhere else. Again,
the intuition is that $p_i^{nc}$ is true along an execution of MX(cr, nc) if and
only if process i is currently in its non-critical region. We assume that
without loss of generality every non-indicative non-critical region can be
made to be indicative by adding the indicator assignments.

6. Informally, a mutual exclusion algorithm MX satisfies the mutual
exclusion property if MX does not allow for an execution along which more
than one process is executing its critical region at the same time. More
precisely, for all well-formed non-critical regions nc and all well-formed
and indicative critical regions cr, it is not the case that MX(cr, nc) has
an execution that contains some state s such that there exist two distinct
processes 1 ≤ i, j ≤ n such that $p_i^{cr}$ and $p_j^{cr}$ are both true in s.

7. A B-synchronization statement is a statement of the form

await B

or

while ¬B do skip.

8. Informally, a mutual exclusion algorithm MX satisfies the eventual entry
property if control always eventually gets past every synchronization
statement in every entry protocol in MX. More precisely, for all well-formed
critical regions cr and well-formed and indicative non-critical regions nc,
it is not the case that there is an execution α of MX(cr, nc) and a process
1 ≤ i ≤ n such that $p_i^{nc}$ eventually remains false forever along α.

9. A mutual exclusion algorithm is correct if it satisfies the mutual exclusion
and the eventual entry property. Note that deadlock-freedom is implied by
eventual entry. □

Note that the variables $p_i^{cr}$ and $p_i^{nc}$ are used only to define correctness formally
and not for the verification of the algorithm itself. They thus differ from the
auxiliary variables used in [OG76a, AO91] which are essential for the verification.

The  following  lemma  characterizes  eventual  entry  in  terms  of  context-sensitive 
approximation.  This  characterization  will  later  allow  us  to  formulate  a  sufficient 
condition  for  eventual  entry. 

Lemma  8.1  (Characterizing  eventual  entry) 

A mutual exclusion algorithm MX has the eventual entry property if and only
if the behaviour of every B-synchronization statement S in every entry protocol
in MX is captured by a single stuttering step in B, that is,

5  {B} 

where E is such that MX = E[S].

Proof: Suppose MX does not satisfy eventual entry. Thus, there are well-formed
critical regions cr and well-formed and indicative non-critical regions nc such
that MX(cr, nc) has an execution α along which $p_i^{nc}$ remains false forever
for some i. Since exit_i and cr_i terminate, process i must be executing its
entry protocol forever. Since synchronization statements are the only
non-terminating statements in the entry protocol, process i must be blocked
forever at a B-synchronization statement S for some B. In that case, however,
the behaviour of S is not identical to finite
stuttering,  that  is,  S  for  the  relevant  context  E. 

If, on the other hand, MX contains a B-synchronization statement whose
behaviour goes beyond finite stuttering in the entry protocol of some process
i, then MX has an execution along which process i eventually never leaves its
entry protocol. If nc_i is indicative, $p_i^{nc}$ will remain false forever. ■

Recall that the definition of every B-synchronization statement consists of two
disjuncts where the second one deals with the case that B never becomes true.
The above characterization implies that if C has the eventual entry property,
then this disjunct can be removed without changing the set of executions of C.

Corollary  8.1  (Simplifying  synchronization  statements) 

Suppose C has the eventual entry property. If C is of the form E1[await B] for
some E1, then


C  =  £’i  [await  R]  £’i[{5}]. 
If C is of the form E2[while ¬B do skip] for some E2, then
C  =  ^^2 [while  -iJ3  do  skip]  =£t  E2[{B}]. 
Proof:  Using  Lemma  8.1,  Lemma  4.2  and  the  fact  that 
await  B  while  -'B  do  skip. 


■ 


We  need  to  be  able  to  express  that  a  property  always  holds  when  a  certain 
program  fragment  is  executed. 

Definition 8.2 (B holds during C' in C)

Given a program C with a subprogram C', that is, C = E[C'] for some E, and
given a property B, we say that B holds during C' in C if for every execution
of C property B is always true when control resides in C'. Formally,

E[C]  =£t  E[C\B] 


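where C\B adds B to the pre- and postcondition of every atomic transition of
C, that is,

$$\begin{array}{rcl}
V{:}[P, Q] \backslash B & = & V{:}[P \wedge B,\ Q \wedge B] \\
(C_1 ; C_2) \backslash B & = & (C_1 \backslash B) ; (C_2 \backslash B) \\
(C_1 \vee C_2) \backslash B & = & (C_1 \backslash B) \vee (C_2 \backslash B) \\
(C_1 \parallel C_2) \backslash B & = & (C_1 \backslash B) \parallel (C_2 \backslash B) \\
C^{*} \backslash B & = & (C \backslash B)^{*} \\
C^{\omega} \backslash B & = & (C \backslash B)^{\omega} \\
(\mathbf{new}\ x = e\ \mathbf{in}\ C) \backslash B & = & \mathbf{new}\ x = e\ \mathbf{in}\ (C \backslash B).
\end{array}$$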


□ 


Note that the above notation could also be defined using an assumption-commitment
formula. Informally, B holds during C' in C if

$[B, \Gamma]\ C'\ [B, \Delta \cup \{B\}]$

such that the assumptions B and Γ can be “discharged” in the environment E
of C'. This definition is more informative, but also more inconvenient, because
it requires an explicit statement of the assumptions necessary for B to hold
during C'.

To obtain a tractable, sufficient condition for the eventual entry property,
we borrow a technique from sequential programming. To prove termination
of a loop while B do C in a sequential program, we find an expression m
such that m is always non-negative, and m = 0 implies ¬B, and m is always
decreased by C. To show eventual entry, we find such an expression for every
B-synchronization statement in the program and show that

• m is always non-negative, and

• m = 0 implies B, and

• m will always eventually be set to zero by the environment.

The  following  lemma  is  based  on  this  idea.  If  a  certain  predicate  P  is  known 
to  hold  during  the  synchronization  statement,  then  this  information  can  be 
incorporated  too. 

Lemma  8.2  (Eventual  entry  in  parallel  program) 

Let C be the parallel composition C = C0 ∥ C1. Given an arithmetic expression
m over the variables in C, let A_m denote

Am  ^  Var:[tty  m=  0  — )■  m  =  0  |  m  <m] 

where P_if → P_then | P_else stands for (P_if ⇒ P_then) ∧ (¬P_if ⇒ P_else). Intuitively,
A_m decreases m if it is not zero and leaves it unchanged if it is zero.

• C_i has the eventual entry property iff for every B-synchronization
statement S in C_i there exists a predicate P and an expression m over the
variables in C such that

1. P holds during S in C, and

2. m ≥ 0, and

3. P ∧ m = 0 implies B, and

4. the parallel environment of C_i either decrements m infinitely often
or does not allow for ¬P to be true infinitely often. Formally,

Ci-i  Cqrt  {inv*m  ;  Am)"^  V  {inv*m  ;  Am)*  ;  Dy 

for some D such that D has no execution along which ¬P is true
infinitely often.

• C has the eventual entry property iff C0 and C1 do.

Proof:  See  Section  A. 4.1.  ■ 

To  prove  that  control  always  eventually  gets  past  a  P-synchronization  state¬ 
ment  in  Co,  the  third  condition  of  the  above  lemma  thus  requires  us  to  show 
that  along  every  trace  of  the  environment  of  Co,  the  synchronization  condition 
P  cannot  be  false  infinitely  often,  and  thus  Co  cannot  block  at  the  synchroniza¬ 
tion  statement  forever.  Note  that  the  lemma  supports  compositional  reasoning 
in  the  sense  that  the  eventual  entry  of  Co  is  determined  by  solely  looking  at  its 
parallel  environment  Ci.  However,  the  third  condition  requires  us  to  consider 
the  executions  of  the  environment,  which  cannot  be  done  compositionally. 
8.4 Examples of n-process mutual exclusion algorithms

We  now  present  three  n-process  mutual  exclusion  algorithms  following  the  ex¬ 
position  in  [And91]. 

Example  8.1  (Tie-breaker  algorithm) 

We present the tie-breaker algorithm, also called Peterson’s algorithm [Pet81].
The entry protocol in each process consists of a loop that iterates through n − 1
levels. A process will only be allowed to enter the critical region if it has
completed all n − 1 levels. The last process to enter a level l will be forced to
remain on that level until another process joins that level and thus becomes last.
Thus, informally, one process will always remain on each level. Since there are
n − 1 levels and n processes, at most two processes can be on the highest level
at the same time. The synchronization then ensures that only one of them can
complete the highest level. It is this process which will be allowed to progress
into its critical region. The synchronization conditions in the entry protocol are
weak enough to ensure deadlock freedom. Eventual entry is guaranteed because
the condition that a process i is waiting on will always eventually become true
and then remain true until i “moves on” to the next level. Let TIE be

TIE(cr, nc) = new in[1..n] = 0, last[1..n] = 0 in ∥_{i=1}^{n} C_i

where C_i is

C_i = while tt do
          for l := 1 to n − 1 do                         (* entry protocol *)
              in[i] := l;
              last[l] := i;
              for j := 1 to n st j ≠ i do
                  while in[j] ≥ l ∧ last[l] = i do skip
              od
          od;
          cr_i;
          in[i] := 0;                                    (* exit protocol *)
          nc_i
      od

and where the values of in or last are not changed in cr_i or nc_i. Note that cr_i
and nc_i are well-formed iff

cvi  C-fi  inv*  {in^last} 
nci  C'lt  inv^  {injast}. 

Moreover,  note  that  the  algorithm  is  still  correct  if  the  execution  of  as¬ 
signments  and  the  evaluation  of  expressions  is  not  assumed  to  be  atomic.  In 
Section  8.8  we  will  discuss  an  extension  of  our  framework  to  handle  this  case. 

□ 
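For small n, mutual exclusion of the tie-breaker algorithm can be checked by enumerating all interleavings. The sketch below is an illustration only: it assumes that each assignment and each evaluation of the loop test is one atomic step, it skips the non-critical regions, and the names `next_j`, `step` and `max_in_critical` are hypothetical. Processes and levels are numbered from 1, so index 0 of each tuple is unused.

```python
from collections import deque


def next_j(i, j, n):
    """Smallest index greater than j that differs from i, or None if exhausted."""
    j += 1
    if j == i:
        j += 1
    return j if j <= n else None


def step(state, i, n):
    """One atomic step of process i, or None if i is busy-waiting."""
    in_, last, procs = state
    pc, l, j = procs[i]
    in_, last, procs = list(in_), list(last), list(procs)
    if pc == 0:                                 # in[i] := l
        in_[i] = l
        procs[i] = (1, l, j)
    elif pc == 1:                               # last[l] := i
        last[l] = i
        procs[i] = (2, l, next_j(i, 0, n))
    elif pc == 2:                               # while in[j] >= l and last[l] = i
        if in_[j] >= l and last[l] == i:
            return None                         # skip: state unchanged
        nj = next_j(i, j, n)
        if nj is not None:
            procs[i] = (2, l, nj)
        elif l < n - 1:
            procs[i] = (0, l + 1, 0)            # move on to the next level
        else:
            procs[i] = (3, l, 0)                # enter the critical region
    else:                                       # exit protocol: in[i] := 0
        in_[i] = 0
        procs[i] = (0, 1, 0)
    return (tuple(in_), tuple(last), tuple(procs))


def max_in_critical(n):
    """Largest number of processes found in the critical region at once (n >= 2)."""
    init = ((0,) * (n + 1), (0,) * n, ((0, 1, 0),) * (n + 1))
    seen, queue, worst = {init}, deque([init]), 0
    while queue:
        state = queue.popleft()
        worst = max(worst, sum(p[0] == 3 for p in state[2][1:]))
        for i in range(1, n + 1):
            nxt = step(state, i, n)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return worst
```

Such an enumeration only covers fixed small values of n and says nothing about eventual entry, which is why the refinement-based argument of this chapter is still needed for the general case.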


Example 8.2 (Bakery algorithm)

The next algorithm achieves mutual exclusion by assigning a unique number
turn[i] to each process i that intends to enter its critical region. This number is
chosen to be greater than all the numbers already assigned so far. Permission
to enter the critical region is granted to process i when all remaining processes
j either are not interested in entering (turn[j] = 0) or have a greater number
(turn[j] > turn[i]). Deadlock cannot occur since the non-zero values of turn
are unique. Eventual entry is guaranteed for the same reasons as for the
tie-breaker algorithm. Since a similar scheme is adopted in some grocery stores,
this algorithm is called the bakery algorithm. Let BAK be

BAK(cr, nc) = new turn[1..n] = 0 in ∥_{i=1}^{n} C_i

where 

C_i = while tt do
          turn[i] := max{turn[j] | 1 ≤ j ≤ n} + 1;       (* entry protocol *)
          for j := 1 to n st j ≠ i do
              await turn[j] = 0 ∨ turn[i] < turn[j]
          od;
          cr_i;
          turn[i] := 0;                                  (* exit protocol *)
          nc_i
      od

where the value of turn is not changed in cr_i or nc_i. Note that cr_i and nc_i are
well-formed iff

cri  Qt^  inv*  {turn} 
nci  Cq-t  inF^{turn}. 

In contrast to the tie-breaker algorithm, the correctness of the bakery algorithm
relies on the atomicity of the assignments and tests in the entry and exit
protocols. □
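The role of atomicity can be made concrete with a small interleaving simulation in which every assignment and every await test is executed as one indivisible step. This is a sketch under that assumption, with a seeded pseudo-random scheduler and the hypothetical name `simulate_bakery`; processes are numbered from 0 here, and the non-critical regions are skipped.

```python
import random


def simulate_bakery(n, steps, seed=0):
    """Run `steps` atomic steps of the bakery algorithm; return entries into cr."""
    rng = random.Random(seed)
    turn = [0] * n
    pc = [0] * n        # 0: take a number, 1: waiting on j, 2: in critical region
    j = [0] * n
    entries = 0
    for _ in range(steps):
        i = rng.randrange(n)
        if pc[i] == 0:                           # turn[i] := max{turn[j]} + 1
            turn[i] = max(turn) + 1
            pc[i], j[i] = 1, 0
        elif pc[i] == 1:                         # await turn[j] = 0 or turn[i] < turn[j]
            k = j[i]
            if k == i or turn[k] == 0 or turn[i] < turn[k]:
                j[i] += 1
                if j[i] == n:                    # all other processes checked
                    pc[i] = 2
                    entries += 1
        else:                                    # exit protocol: turn[i] := 0
            turn[i] = 0
            pc[i] = 0
        assert sum(p == 2 for p in pc) <= 1, "mutual exclusion violated"
    return entries


entries = simulate_bakery(4, 20000, seed=1)
```

A simulation of this kind only samples interleavings. It can expose a violation but cannot establish correctness, and the values of turn are unbounded, so exhaustive enumeration is not directly available.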

Example  8.3  (Ticket  algorithm) 

The  ticket  algorithm  is  similar  to  the  bakery  algorithm.  The  ticket  algorithm 
differs  from  the  bakery  algorithm  in  that  it  uses  a  single  global  counter  num  to 
set  the  local  counter  turn[i]  rather  than  all  of  the  other  local  counters  turn[j] 
for  j  ^  i.  Let  TIC  be 

TIC(cr, nc) = new num = 1, next = 1, turn[1..n] = 0 in ∥_{i=1}^{n} C_i

where

C_i = while tt do
          turn[i], num := num, num + 1;                  (* entry protocol *)
          await turn[i] = next;
          cr_i;
          next := next + 1;                              (* exit protocol *)
          nc_i
      od


where the values of num, next or turn are never changed in cr_i or nc_i. Note
that cr_i and nc_i are well-formed iff

mv’^  {num,  next f  turn} 
nci  Qjx  inv^ {num,  next,  turn}. 

Like  the  bakery  algorithm,  the  ticket  algorithm  relies  on  the  atomicity  of  the 
assignments  and  the  test  in  both  the  entry  and  the  exit  protocol.  □ 
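The ticket algorithm can be exercised in the same way. The sketch below assumes that the multiple assignment and the await test are each one atomic step, uses a seeded pseudo-random scheduler, and records the ticket numbers in the order in which they are admitted. The name `simulate_ticket` is hypothetical, and processes are numbered from 0.

```python
import random


def simulate_ticket(n, steps, seed=0):
    """Run `steps` atomic steps; return the tickets in order of admission."""
    rng = random.Random(seed)
    num, nxt = 1, 1
    turn = [0] * n
    pc = [0] * n        # 0: draw a ticket, 1: waiting, 2: in critical region
    served = []
    for _ in range(steps):
        i = rng.randrange(n)
        if pc[i] == 0:                  # turn[i], num := num, num + 1
            turn[i], num = num, num + 1
            pc[i] = 1
        elif pc[i] == 1:                # await turn[i] = next
            if turn[i] == nxt:
                pc[i] = 2
                served.append(turn[i])
        else:                           # exit protocol: next := next + 1
            nxt += 1
            pc[i] = 0
        assert sum(p == 2 for p in pc) <= 1, "mutual exclusion violated"
    return served


served = simulate_ticket(3, 5000, seed=2)
```

In this model processes are admitted strictly in ticket order, which reflects the first-come, first-served character of the algorithm.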


8.5  Verification  strategy 

This section demonstrates how an n-process mutual exclusion algorithm MX can
be verified using our framework. To verify MX we

1. find an appropriate coarse-grained representation MX',

2.  verify  MX'  using  program  transformation,  invariants  and  induction,  and 

3.  successively  refine  MX'  into  MX  by  a  sequence  of  correctness-preserving 
program  transformation  steps. 

The  correctness  of  MX'  implies  the  correctness  of  MX  by  the  following  lemma. 

Lemma  8.3  (Correctness  and  execution  inclusion) 

Suppose  we  have 

MX{j^,nc)  MX'{^,nc).  (8.1) 

for some MX' and for all well-formed cr and nc. Then,

1.  MX'  satisfies  mutual  exclusion  if  MX  does. 

2.  MX'  has  the  eventual  entry  property  if  MX  does. 

Proof: 1) Suppose MX satisfies mutual exclusion but MX' does not, that is,
there are well-formed nc and well-formed and indicative cr such that MX'(cr, nc)
has an execution α in which process i and process j execute their critical regions
simultaneously. Due to (8.1), α also is an execution of MX(cr, nc). This,
however, contradicts the assumption that MX is mutually exclusive. 2) Suppose
MX satisfies eventual entry but MX' does not, that is, there are well-formed
cr and well-formed and indicative nc such that MX'(cr, nc) has an execution α
along which some process i eventually “gets stuck” in its entry protocol, that
is, predicate $p_i^{nc}$ eventually remains false forever along α. Due to (8.1), α also
is an execution of MX(cr, nc). This, however, contradicts the assumption that
MX satisfies eventual entry. ■

We  will  now  illustrate  the  verification  of  an  n-process  mutual  exclusion  algo¬ 
rithm  using  the  tie-breaker  algorithm  as  an  example.  An  appropriately  abstract, 
coarse-grained  representation  is  introduced  and  verified  in  Section  8.6.  The  suc¬ 
cessive  refinement  of  that  abstract  representation  is  dealt  with  in  Section  8.7. 
8.6 Verification of coarse-grained algorithms using invariants

To be able to verify TIE conveniently using invariants and without
auxiliary variables or program locations, we choose the following considerably more
coarse-grained representation (the subscript indicates the atomic evaluation
of the assignments to in and last and the test in the entry protocol).
Unless explicitly stated otherwise, i, j, k, l and x range over 1, ..., n.

= new in[1..n] = 0, last[1..n] = 0 in ∥_{i=1}^{n} E_i[entry_i]

where E_i is

E_i = while tt do
          for l := 1 to n − 1 do
              [ ]
          od;
          cr_i;
          in[i] := 0;
          nc_i
      od

and 

entry_i = in[i], last[in[i] + 1] := in[i] + 1, i;
          await ∀j ≠ i. in[j] < in[i] ∨ last[in[i]] ≠ i

where ∀j ≠ i. B stands for ∀1 ≤ j ≤ n. j ≠ i ⇒ B. Compared to TIE there are
two  differences. 

1. The sequential composition of the two assignments in[i] := l; last[l] := i is
replaced by the multiple assignment in[i], last[in[i] + 1] := in[i] + 1, i.

2. The loop of tests

    for j := 1 to n st j ≠ i do
        while in[j] ≥ l ∧ last[l] = i do skip
    od

is replaced by one high-level test

    await ∀j ≠ i. in[j] < in[i] ∨ last[in[i]] ≠ i.

Very often, we will also use the equivalent, yet more mnemonic notation

    await highest(i) ∨ ¬last(i)

where

$\mathit{highest}(i) \equiv \forall j \neq i.\ in[j] < in[i]$

$\mathit{last}(i) \equiv (last[in[i]] = i).$

In the rest of this section we will give a detailed verification of this
coarse-grained representation. We start with mutual exclusion.
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8,6.1  Mutual  exclusion 

The  following  predicates  will  be  important. 

    last(i)    ≡ last[in[i]] = i
    highest(i) ≡ ∀j ≠ i. in[j] < in[i]
    tower(i)   ≡ ∀ 1 ≤ l < in[i]. in[last[l]] = l

where $1 \leq i \leq n$. Intuitively, process $i$ satisfies

•  $\neg last(i)$ iff $i$ is not the last process to have entered the level that it is
currently on.

•  $highest(i)$ iff all other processes are below $i$.

•  $tower(i)$ iff for all levels $l$ below the level that $i$ is on, $last[l]$ points to a
process that is on that level. Process $i$ can thus be thought of as "standing
on the shoulders" of $in[i] - 1$ other processes.

To prove mutual exclusion for $TIE^1_{at,at}$ we first show that it preserves the
invariant $P_1 \wedge P_2$ defined below. Intuitively, $P_1$ expresses that throughout every
execution of $TIE^1_{at,at}$ every process either is the highest or is standing on a
tower. $P_2$ says that if a process $i$ is on a level $l$ greater than 0, then $last[l]$ points
to a process that is also on $l$.

Lemma  8.4  (Invariant  of  the  tie-breaker  algorithm) 

Let $C_i$ range over the $n$ processes in $TIE^1_{at,at}$. Let

    P_1   ≡ ∀ 1 ≤ i ≤ n. tower(i) ∨ highest(i)
    P_2   ≡ ∀ 1 ≤ i ≤ n. in[i] > 0 ⇒ in[last[in[i]]] = in[i]
    P     ≡ P_1 ∧ P_2
    B_i   ≡ in[i] > 0 ⇒ (highest(i) ∨ ¬last(i))
    B     ≡ ∀ 1 ≤ i ≤ n. B_i
    bot_i ≡ in[i] = 0.

Then,

$[P \wedge \forall 1 \leq i \leq n.\ bot_i,\ Preds(Var)] \quad \|_{i=1}^{n} C_i \quad [tt,\ \{P\}]$.

If started in a state in which all processes are on level 0 and that satisfies $P$, and
in an environment that preserves every predicate, then the program $\|_{i=1}^{n} C_i$ will
preserve the predicate $P$. In other words, $P$ always holds along every execution
of $\|_{i=1}^{n} C_i$.

Proof: By induction over $n$. As is common in inductive proofs, we will prove a
slightly stronger statement, which gives us a more general induction hypothesis
and then specializes to the desired result. Let $N$ denote $N = \{1, \ldots, n\}$. Given
a set $J \subseteq N$ and predicates $B_j$ for each $j \in J$, let

    B_J   ≡ {B_j | j ∈ J}
    Γ_J   ≡ B_J ∪ Preds({in[j] | j ∈ J})
    bot_J ≡ ∀j ∈ J. bot_j.

Given a set $J$ of process indices, $\Gamma_J$ contains, for each process $i$ in $J$, the await
condition $B_i$ and all predicates over $in[i]$. For an environment to satisfy the
assumptions $\Gamma_J$, it must preserve $B_i$ and leave $in[i]$ unchanged for all $i$ in $J$.
We show

$[P \wedge bot_{\{1,\ldots,j\}},\ \{P\} \cup \Gamma_{\{1,\ldots,j\}}] \quad \|_{i=1}^{j} C_i \quad [tt,\ \{P\} \cup \Gamma_{\{j+1,\ldots,n\}}]$

by induction over $j$. Note how weakening, that is, strengthening of the assumptions,
and $j = n$ imply the desired result.

Base: $j = 1$. We show in Section A.4.2, page 256, that process $C_i$ preserves $P$
and all predicates in $\Gamma_{N \setminus \{i\}}$ provided that the environment preserves $P$ and all
predicates in $\Gamma_{\{i\}}$ and the initial state satisfies $P \wedge bot_{\{i\}}$:

$[P \wedge bot_{\{i\}},\ \{P\} \cup \Gamma_{\{i\}}] \quad C_i \quad [tt,\ \{P\} \cup \Gamma_{N \setminus \{i\}}]$    (8.2)

for all $1 \leq i \leq n$, which implies the base case

$[P \wedge bot_{\{1\}},\ \{P\} \cup \Gamma_{\{1\}}] \quad C_1 \quad [tt,\ \{P\} \cup \Gamma_{N \setminus \{1\}}]$.

Step: $j = j' + 1$. By induction hypothesis,

$[P \wedge bot_{\{1,\ldots,j'\}},\ \{P\} \cup \Gamma_{\{1,\ldots,j'\}}] \quad \|_{i=1}^{j'} C_i \quad [tt,\ \{P\} \cup \Gamma_{\{j'+1,\ldots,n\}}]$.

Also, using the assumption-commitment formula (8.2),

$[P \wedge bot_{\{j'+1\}},\ \{P\} \cup \Gamma_{\{j'+1\}}] \quad C_{j'+1} \quad [tt,\ \{P\} \cup \Gamma_{N \setminus \{j'+1\}}]$.

Then, since the guarantees imply the assumptions, that is,

$\{P\} \cup \Gamma_{\{1,\ldots,j'\}} \subseteq \{P\} \cup \Gamma_{N \setminus \{j'+1\}}$

and

$\{P\} \cup \Gamma_{\{j'+1\}} \subseteq \{P\} \cup \Gamma_{\{j'+1,\ldots,n\}}$,

we conclude with rule PAR-V that

$[P \wedge bot_{\{1,\ldots,j'+1\}},\ \{P\} \cup \Gamma_{\{1,\ldots,j'+1\}}] \quad \|_{i=1}^{j'+1} C_i \quad [tt,\ \{P\} \cup \Gamma_{\{j'+2,\ldots,n\}}]$    (PAR-V).

This completes the induction.  ■

Figure 8.1 below illustrates the invariant

    P_1 ≡ ∀ 1 ≤ i ≤ n. tower(i) ∨ highest(i)

for the special case of ten processes, that is, for $n = 10$. The processes are
represented by the numbers 1 through 10. The levels of the pyramid represent
the possible values of a field of $in$. More precisely, process $i$ is shown in level
$l$ iff $in[i] = l$. For instance, process 5 is on level 6 in both pyramids, that is,
$in[5] = 6$. In the left pyramid, process 5 is the highest, that is, $highest(5)$ holds.

Figure 8.1: Illustration of invariant $P_1$ of the tie-breaker algorithm

In the right pyramid, process 5 has been overtaken by process 4 and thus ceases
to be the highest process. However, it now stands on a tower of processes, that
is, we have $tower(5)$.

The property $P$ holds throughout the entire execution of $TIE^1_{at,at}$. There
are a few properties that are essential for showing mutual exclusion, but that
only hold intermittently.

Lemma  8.5  (Properties  during  synchronization,  critical  and  non-critical 
region) 

Let $top_i$, $bot_i$ and $B_i$ denote

    top_i ≡ in[i] = n - 1
    bot_i ≡ in[i] = 0
    B_i   ≡ highest(i) ∨ ¬last(i).

Then, for all $1 \leq i \leq n$,

1.  the predicate $\neg bot_i$ holds during the synchronization statement await $B_i$
in $TIE^1_{at,at}$,

2.  the predicate $top_i \wedge B_i$ holds during the critical region $cr_i$ in $TIE^1_{at,at}$, and

3.  the predicate $bot_i$ holds during the non-critical region $nc_i$ in $TIE^1_{at,at}$.

Proof: See Section A.4.3, page 261.

Figure 8.2: Illustration for the proof of mutual exclusion

Proposition 8.1 (Mutual exclusion for $TIE^1_{at,at}$)

$TIE^1_{at,at}$ satisfies the mutual exclusion property.

Proof: To show mutual exclusion for $TIE^1_{at,at}(cr, nc)$ we show that along every
execution of $\|_{i=1}^{n} C_i$ from an initial state satisfying $in[1..n] = 0 \wedge last[1..n] = 0$
at most one process is in its critical region. Consider an execution $\alpha$ of $\|_{i=1}^{n} C_i$
starting in $in[1..n] = 0 \wedge last[1..n] = 0$ and let $s$ be a state along $\alpha$. Suppose in
state $s$ process $i$ is in its critical region. Then, due to Lemma 8.5, $s$ also satisfies
$top_i \wedge B_i$. Remember that in Lemma 8.4 we showed that

    P   ≡ P_1 ∧ P_2
    P_1 ≡ ∀ 1 ≤ i ≤ n. tower(i) ∨ highest(i)
    P_2 ≡ ∀ 1 ≤ i ≤ n. in[i] > 0 ⇒ in[last[in[i]]] = in[i]

is an invariant in $TIE^1_{at,at}$. Thus, $s$ also satisfies $P$. Suppose for a contradiction
that at least one other process $j$ is in its critical region, too. Then, by the same
argument, $s$ also satisfies $top_j \wedge B_j$. Since $i$ and $j$ are both on the highest level,
neither of them is highest. Thus, $P$ implies $tower(i)$ and $tower(j)$. In other
words, $n$ processes are distributed over $n - 1$ levels with one process on each
level from 1 to $n - 2$ and the two processes $i$ and $j$ on level $n - 1$. Figure 8.2
illustrates this situation by giving an example for the case of ten processes. By
$P_2$ this implies that $last[n - 1]$ is either $i$ or $j$. Consequently, $B_i \wedge B_j$ is false in
$s$, which yields the desired contradiction.  ■
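
For small $n$, both the invariant of Lemma 8.4 and mutual exclusion can also be checked by brute force. The following Python sketch explores every reachable state of a hand-written model of the coarse-grained algorithm for $n \geq 2$. The encoding is an assumption made for illustration only: each process has a program counter with the hypothetical values "enter", "await", "cr" and "nc", the multiple assignment and the await test are each one atomic step, and the critical and non-critical regions are collapsed into single steps.

```python
from collections import deque

def highest(i, level, n):
    return all(level[j] < level[i] for j in range(1, n + 1) if j != i)

def successors(state, n):
    """Yield all states reachable by one atomic step of one process (1..n)."""
    level, last_at, pc = state          # index 0 of each tuple is unused padding
    for i in range(1, n + 1):
        lv, la, p = list(level), list(last_at), list(pc)
        if pc[i] == "enter":            # in[i], last[in[i]+1] := in[i]+1, i
            lv[i] += 1
            la[lv[i]] = i
            p[i] = "await"
        elif pc[i] == "await":          # await highest(i) or not last(i)
            if not (highest(i, level, n) or last_at[level[i]] != i):
                continue                # blocked: no step for process i
            p[i] = "cr" if level[i] == n - 1 else "enter"
        elif pc[i] == "cr":             # leave critical region: in[i] := 0
            lv[i] = 0
            p[i] = "nc"
        else:                           # "nc": start the entry protocol again
            p[i] = "enter"
        yield tuple(lv), tuple(la), tuple(p)

def invariant_p(state, n):
    """P = P1 and P2 from Lemma 8.4, evaluated on one state."""
    level, last_at, _ = state
    procs = range(1, n + 1)
    tower = lambda i: all(level[last_at[l]] == l for l in range(1, level[i]))
    p1 = all(tower(i) or highest(i, level, n) for i in procs)
    p2 = all(level[i] == 0 or level[last_at[level[i]]] == level[i] for i in procs)
    return p1 and p2

def explore(n):
    """Breadth-first search; asserts P and mutual exclusion in every state."""
    init = ((0,) * (n + 1), (0,) * n, ("",) + ("enter",) * n)
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        assert invariant_p(state, n), state
        assert state[2].count("cr") <= 1, state
        for nxt in successors(state, n):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)
```

Calling `explore(3)` returns the number of reachable states of the model. Such a check covers only the chosen $n$ and only this model, so it complements rather than replaces the proofs above.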




8.6.2  Eventual  entry 

We now show that $TIE^1_{at,at}$ has the eventual entry property. Let $B_i$ be the
await condition in $C_i$, that is, $B_i \equiv highest(i) \vee \neg last(i)$. We will use
Lemma 8.2. Let $m_i$ be

$m_i \equiv \begin{cases} 0, & \text{if } B_i \vee bot_i \\ \sum_{j \neq i} cond(in[j] = in[i],\ n,\ in[i] - in[j] \pmod{n}), & \text{otherwise} \end{cases}$

where

$cond(B, e_1, e_2) \equiv \begin{cases} e_1, & \text{if } B \\ e_2, & \text{otherwise.} \end{cases}$

Intuitively, $m_i = 0$ iff process $i$ can move to the next level. Moreover, if $m_i > 0$,
then $m_i$ is the sum over the number of levels that each process $j \neq i$ is away from
the level that $i$ is on. Thus, $m_i$ indicates the maximal number of "steps" that
process $i$ has to remain on its level until it is "released" via another process
that enters its level and thus becomes last. Note that if $last(i) \wedge \neg B_i$, the
summation adds $n$ for each process $j \neq i$ for which $in[j] = in[i]$. This is
because process $j$ needs to advance $n$ levels before it can release process $i$.
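
To make the definition of $m_i$ concrete, the following Python sketch evaluates it on a small state. The dictionary encoding (`level` for in, `last_at` for last) and the three-process example are assumptions for illustration; the formula itself follows the definition above.

```python
def cond(b, e1, e2):
    """cond(B, e1, e2): e1 if B holds, e2 otherwise."""
    return e1 if b else e2

def variant(i, level, last_at):
    """m_i for process i; level[j] models in[j], last_at[l] models last[l]."""
    n = len(level)
    is_highest = all(level[j] < level[i] for j in level if j != i)
    is_last = last_at.get(level[i]) == i
    b_i = is_highest or not is_last          # await condition B_i
    if b_i or level[i] == 0:                 # B_i or bot_i
        return 0
    return sum(cond(level[j] == level[i], n, (level[i] - level[j]) % n)
               for j in level if j != i)

# Process 1 is blocked on level 1: it is last there and process 2 is above it.
blocked = variant(1, {1: 1, 2: 2, 3: 0}, {1: 1, 2: 2})
# Process 3 enters level 1 and becomes last, which releases process 1.
released = variant(1, {1: 1, 2: 2, 3: 1}, {1: 3, 2: 2})
```

In the first state process 2 is two levels away (modulo 3) and process 3 is one level away, so the sketch yields 3; once process 3 has entered level 1, $B_1$ holds and the value drops to 0.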

Clearly, $m_i$ is always greater than or equal to 0. Moreover, $\neg bot_i \wedge m_i = 0$ implies $B_i$,
where $\neg bot_i$ is known to hold during await $B_i$ using Lemma 8.5. For natural
numbers $m$ and $n$ with $m \leq n$, let $D^n_m$ denote the trace set in which every
process $j$ with $m \leq j \leq n$ and $j \neq i$ eventually either forever blocks at a
$B_j$-synchronization statement or forever stays in the non-critical region executing
$V{:}[bot_j, bot_j]$ where $V = Var \setminus \{in, last\}$. More precisely, let

To establish the third condition we need to show that

1.  $\|_{j=1, j \neq i}^{n} C_j \ \subseteq_{\mathcal{T}^\dagger}\ (inv^* m_i \,;\, \Delta m_i)^\omega \vee (inv^* m_i \,;\, \Delta m_i)^* \,;\, D^n_1$,

2.  $D^n_1$ has no execution along which $\neg B_i$ is true infinitely often.

We prove the first item by induction over $n$.

Base: $n = 1$. We have

$C_j \ \subseteq_{\mathcal{T}^\dagger}\ (inv^* m_i \,;\, \Delta m_i)^\omega \vee (inv^* m_i \,;\, \Delta m_i)^* \,;\, D_j$

for all $j \neq i$ where

$D_j \equiv inv^* m_i \,;\, (\{\neg B_j\}^\omega \vee V{:}[bot_j, bot_j]^\omega)$.

To see this let

$A_j \equiv in[j], last[in[j] + 1] := in[j] + 1, j$

and observe that

$A_j \ \subseteq_{\mathcal{T}^\dagger}\ \Delta m_i$    (8.3)

$in[j] := 0 \ \subseteq_{\mathcal{T}^\dagger}\ \Delta m_i$

for all $j \neq i$. The remaining atomic statements $A$ in $C_j$ always leave $m_i$
unchanged, that is, $A \subseteq_{\mathcal{T}^\dagger} inv^* m_i$. Moreover, every trace $\alpha$ of $C_j$ falls
into one of three categories:

1.  The body of $C_j$ is executed infinitely often. Thus, $m_i$ is reduced
infinitely often. In this case,

$\alpha \in \mathcal{T}^\dagger[\![(inv^* m_i \,;\, \Delta m_i)^\omega]\!]$.

2.  The body of $C_j$ is executed only finitely often, because execution
eventually gets blocked forever at the $B_j$-synchronization statement
in its entry protocol, that is, after a finite number of reductions of
$m_i$, $\alpha$ ends in $\{\neg B_j\}^\omega$. In this case,

$\alpha \in \mathcal{T}^\dagger[\![(inv^* m_i \,;\, \Delta m_i)^* \,;\, inv^* m_i \,;\, \{\neg B_j\}^\omega]\!]$.

3.  The body of $C_j$ is executed only finitely often, because the non-critical
section never terminates, that is, after some finite number of
reductions of $m_i$, the non-critical region $nc_j$ is executed forever. Due
to Lemma 8.5, predicate $bot_j$ always holds during $nc_j$. Thus, in this
case,

$\alpha \in \mathcal{T}^\dagger[\![(inv^* m_i \,;\, \Delta m_i)^* \,;\, inv^* m_i \,;\, V{:}[bot_j, bot_j]^\omega]\!]$.

To see (8.3) we show that every transition of $A_j$ is also a transition of $\Delta m_i$ and
then use Lemma 2.2.5. Let $(s, s')$ be a transition of $A_j$. If $m_i = 0$ in $s$ then also
$m_i = 0$ in $s'$, since $A_j$ preserves $bot_i \vee \neg last(i)$. Otherwise, if $m_i = v > 0$ in $s$,
then $A_j$ brings $j$ one level closer to $i$ while leaving the distances of other
processes unchanged. Thus, $m_i < v$ in $s'$. Consequently, $(s, s')$ is a transition of
$\Delta m_i$ in both cases. This concludes the base case.

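Step: $n = n' + 1$. Let $n' + 1 \neq i$. By induction hypothesis, we have

$\|_{j=1, j \neq i}^{n'} C_j \ \subseteq_{\mathcal{T}^\dagger}\ (inv^* m_i \,;\, \Delta m_i)^\omega \vee (inv^* m_i \,;\, \Delta m_i)^* \,;\, D^{n'}_1$.

We show that for every trace $\alpha$ of $\|_{j=1, j \neq i}^{n'} C_j$ and every trace $\beta$ of $C_{n'+1}$,
we have

$\alpha \,\|\, \beta \ \subseteq\ \mathcal{T}^\dagger[\![(inv^* m_i \,;\, \Delta m_i)^\omega \vee (inv^* m_i \,;\, \Delta m_i)^* \,;\, D^{n'+1}_1]\!]$.

Case 1: $\alpha$ trace of $(inv^* m_i \,;\, \Delta m_i)^\omega$.

    Subcase 1.1: $\beta$ trace of $(inv^* m_i \,;\, \Delta m_i)^\omega$. Then, clearly, all traces in
    $\alpha \,\|\, \beta$ are also in $(inv^* m_i \,;\, \Delta m_i)^\omega$.

    Subcase 1.2: $\beta$ trace of $(inv^* m_i \,;\, \Delta m_i)^* \,;\, D_{n'+1}$. Since every step of
    $D_{n'+1}$ leaves $m_i$ unchanged, and the parallel merge operation as defined in
    Section 2.2.1 is weakly fair, all traces in $\alpha \,\|\, \beta$ are also in
    $(inv^* m_i \,;\, \Delta m_i)^\omega$.

Case 2: $\alpha$ trace of $(inv^* m_i \,;\, \Delta m_i)^* \,;\, D^{n'}_1$.

    Subcase 2.1: $\beta$ trace of $(inv^* m_i \,;\, \Delta m_i)^\omega$. We argue as in Subcase 1.2.

    Subcase 2.2: $\beta$ trace of $(inv^* m_i \,;\, \Delta m_i)^* \,;\, D_{n'+1}$. Then, either

    $\beta \in (inv^* m_i \,;\, \Delta m_i)^* \,;\, inv^* m_i \,;\, \{\neg B_j\}^\omega$

    or

    $\beta \in (inv^* m_i \,;\, \Delta m_i)^* \,;\, inv^* m_i \,;\, V{:}[bot_j, bot_j]^\omega$.

    In both cases, all traces in $\alpha \,\|\, \beta$ are also in $(inv^* m_i \,;\, \Delta m_i)^* \,;\, D^{n'+1}_1$.

This concludes the induction.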

We now need to show that $D^n_1$ has no execution along which $\neg B_i$ is true
infinitely often. We will do this by contradiction. Let $\alpha$ be an execution of $D^n_1$.
Thus, $\alpha = \alpha_1 \alpha_2$ where $\alpha_1 \in inv^* m_i$ and $\alpha_2$ is the part of $\alpha$ in which every
process $j \neq i$ is forever blocked or forever in its non-critical region.

Note that $\alpha_2$ has the following property: If $\neg B_j$ or $bot_j$ is true in some state
along $\alpha_2$, then it remains true forever. Together with fairness this implies that
there is a state $s$ along $\alpha$ such that every process $j \neq i$ is either blocked at
a $B_j$-synchronization statement or executing its non-critical region, that is,
$s$ satisfies $\bigwedge_{j \neq i} (\neg B_j \vee bot_j)$. For a contradiction, assume that $\neg B_i$ is true
infinitely often along $\alpha$. Then, $s$ must be followed by a state $s'$ which satisfies

$\neg B_i \wedge \bigwedge_{j \neq i} (\neg B_j \vee bot_j)$.    (8.4)

Let $J$ be the set of processes $j$ such that $\neg B_j$ holds in $s'$, that is, $J$ is the set of
processes that are not the highest and also the last on their level. Formally, $j \in J$ iff

$s' \models \neg highest(j) \wedge last(j)$.

(Note that $J$ is non-empty since $\neg B_i$ holds by assumption.) Remember that
Lemma 8.4 showed that

    P_2 ≡ ∀ 1 ≤ i ≤ n. in[i] > 0 ⇒ in[last[in[i]]] = in[i]

is an invariant. By $P_2$ we have that $\forall j \in J.\ last(j)$ implies that no two processes
in $J$ are on the same level, that is, $\forall j, j' \in J.\ j \neq j' \Rightarrow in[j] \neq in[j']$. (Note
that $P_2$ and $in[j] = in[j']$ for some $j$ and $j'$ imply $\neg last(j) \vee \neg last(j')$.)
Consequently, one of the processes $k$ in $J$ must be higher than all other processes
in $J$. Moreover, all processes not in $J$ are on level 0 due to (8.4). Thus, $k$ is
the highest of all processes, that is, $highest(k)$ for some $k \in J$. However, this
contradicts $\neg B_k$. Thus, all conditions of Lemma 8.2 are satisfied and we can
conclude that $TIE^1_{at,at}$ has the eventual entry property.

We thus have formally proved the correctness of our abstract, coarse-grained
representation $TIE^1_{at,at}$. In the next section we will begin to refine this
representation.


8.7  Refining  coarse-grained  algorithms 

Having verified $TIE^1_{at,at}$, how can we verify TIE, the algorithm from
Example 8.1, while minimizing the amount of additional verification work? This
section will present rules which allow the correctness-preserving refinement of
high-level representations into low-level ones. We will illustrate the use of these
rules by refining $TIE^1_{at,at}$ into TIE.

8.7.1  Refinement using program assertions

Consider $TIE^1_{at,at}$. There is a correspondence between $in[i]$, the level process
$i$ is in, and the loop counter $l$. For instance, in Lemma 8.5 we showed that
$l = in[i] + 1$ holds during $in[i], last[in[i] + 1] := in[i] + 1, i$ in $TIE^1_{at,at}$. We
now want to use this correspondence to replace expressions containing $in[i]$ by
expressions containing $l$. This refinement will improve readability and efficiency
of the code.

Proposition 8.2 (Refining $TIE^1_{at,at}$ into $TIE^2_{at,at}$)

Let $D_i$ be

    D_i ≡ in[i], last[in[i] + 1] := in[i] + 1, i;
          await ∀j ≠ i. in[j] < in[i] ∨ last[in[i]] ≠ i

and let $E_i$ be such that $TIE^1_{at,at}$ is of the form

$TIE^1_{at,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in $\|_{i=1}^{n} E_i[D_i]$.

Let $TIE^2_{at,at}$ be as $TIE^1_{at,at}$ except that every occurrence of $D_i$ is replaced by
$D'_i$ where

    D'_i ≡ in[i], last[l] := l, i;
           await ∀j ≠ i. in[j] < l ∨ last[l] ≠ i,

that is,

$TIE^2_{at,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in $\|_{i=1}^{n} E_i[D'_i]$.

Then, $TIE^1_{at,at} =_{\mathcal{T}^\dagger} TIE^2_{at,at}$.
Proof: We prove this proposition using straightforward program refinement.
Let $A_1$ and $A_2$ be the two atomic statements in $D_i$, that is,

    A_1 ≡ in[i], last[in[i] + 1] := in[i] + 1, i
    A_2 ≡ await ∀j ≠ i. in[j] < in[i] ∨ last[in[i]] ≠ i

and let $A'_1$ and $A'_2$ be the two atomic statements in $D'_i$, that is,

    A'_1 ≡ in[i], last[l] := l, i
    A'_2 ≡ await ∀j ≠ i. in[j] < l ∨ last[l] ≠ i.

We derive

    R_1 ≡ [v = in[i] + 1, Preds({in[i]})]
            (A_1 ; A_2)[v/l]
          ≻ (A'_1 ; A'_2)[v/l]                         ATOM, SEQ
          [v = in[i], Preds({in[j] | j ≠ i})]

for all $1 \leq v \leq n - 1$. We now obtain a refinement of the for loop.

    R_2 ≡ [bot_i, Preds({in[i]})]
            for l := 1 to n - 1 do D_i
          ≻ for l := 1 to n - 1 do D'_i                FOR(R_1)
          [top_i, Preds({in[j] | j ≠ i})].

The sequential composition of critical section, exit protocol, and non-critical
section satisfies the following assumption-commitment formula.

    R_3 ≡ [top_i, Preds({in[i]})]
            cr_i ; in[i] := 0 ; nc_i                   Lemma 3.4
          [bot_i, Preds({in[j] | j ≠ i})]

That is, if $top_i$ holds initially and the environment does not change $in[i]$, we have
$bot_i$ upon termination and $in[j]$ is unchanged for all $j \neq i$. Remember that $cr_i$
and $nc_i$ are well-formed by assumption, that is,

$cr_i \ \subseteq_{\mathcal{T}^\dagger}\ inv^* \{in, last\}$

$nc_i \ \subseteq_{\mathcal{T}^\dagger}\ inv^\infty \{in, last\}$.

Both refinements are sequentially composed and embedded in the while loop.

    R_{4,i} ≡ [bot_i, Preds({in[i]})]
                C_i ≻ C'_i                             SEQ(R_2, R_3), WHILE
              [tt, Preds({in[j] | j ≠ i})]

where $C_i \equiv E_i[D_i]$ and $C'_i \equiv E_i[D'_i]$. Now all $n$ processes can be put in parallel.

    R_5 ≡ [∧_{i=1}^{n} bot_i, Preds({in[i] | 1 ≤ i ≤ n})]
            ‖_{i=1}^{n} C_i ≻ ‖_{i=1}^{n} C'_i          PAR-V-N(R_{4,i})
          [tt, Preds(∅)].

Finally, declaring the arrays $in$ and $last$ gives us the desired refinement.

    [tt, Preds(∅)]
      TIE^1_{at,at} = E[‖_{i=1}^{n} C_i]
    ≻ E[‖_{i=1}^{n} C'_i] = TIE^2_{at,at}              NEW(R_5, in[1..n], last[1..n])
    [tt, Preds(∅)]

where

$E \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in $[\,]$.

With Lemma 5.5 this implies $TIE^1_{at,at} =_{\mathcal{T}^\dagger} TIE^2_{at,at}$.


The above refinement not only improves the readability but also the efficiency
of the algorithm, because the double evaluation of the expressions $in[i]$
and $in[i] + 1$ is avoided. However, as we will see later, it also increases the generality
of the algorithm in the sense that it enables certain otherwise impossible further
refinements.


8.7.2  Refining  synchronization  statements 


We now describe under what conditions unimplementable, coarse-grained
synchronizations like

    await ∀j ≠ i. in[j] < l ∨ last[l] ≠ i

can be replaced by an implementable, fine-grained sequence of synchronizations
like

    for j := 1 to n st j ≠ i do
      while in[j] ≥ l ∧ last[l] = i do skip

or

    ‖_{j ≠ i} while in[j] ≥ l ∧ last[l] = i do skip.


Refinement Rule 8.1 (Refining synchronization statements)
Let $B$ be a predicate. Then,

    await B  =_{T†}  while ¬B do skip.

Proof: We have

$\textbf{await } B = \{B\} \vee \{\neg B\}^\omega$

and

$\textbf{while } \neg B \textbf{ do skip} = (\{\neg B\} \,;\, \textbf{skip})^* \,;\, \{B\} \vee (\{\neg B\} \,;\, \textbf{skip})^\omega$

$\qquad =_{\mathcal{T}^\dagger} \{\neg B\}^* \,;\, \{B\} \vee \{\neg B\}^\omega$.

Since $\{B\} =_{\mathcal{T}^\dagger} \{\neg B\}^* \,;\, \{B\}$ due to the closure conditions, the result follows.  ■

Note that the above equivalence depends on the atomic evaluation of loop
conditions. The next proposition applies the above refinement rule.

Proposition 8.3 (Refining $TIE^2_{at,at}$ into $TIE^3_{at,at}$)

Let $E_i$ be such that $TIE^2_{at,at}$ is of the form

$TIE^2_{at,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in
$\qquad \|_{i=1}^{n} E_i[\textbf{await } \forall j \neq i.\ B_{i,j}]$

where $B_{i,j} \equiv in[j] < l \vee last[l] \neq i$. Let $TIE^3_{at,at}$ be as $TIE^2_{at,at}$ except that
every occurrence of

    await ∀j ≠ i. B_{i,j}

is replaced by

    while ∃j ≠ i. ¬B_{i,j} do skip.

Formally,

$TIE^3_{at,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in
$\qquad \|_{i=1}^{n} E_i[\textbf{while } \exists j \neq i.\ \neg B_{i,j} \textbf{ do skip}]$.

Then, $TIE^2_{at,at} =_{\mathcal{T}^\dagger} TIE^3_{at,at}$.

Proof: Direct consequence of Refinement Rule 8.1 and congruence (Lemma 2.2.6).  ■

Each parallel process $C_i$ in $TIE^3_{at,at}$ contains a single synchronization statement

    while ∃j ≠ i. ¬B_{i,j} do skip.

We want to refine this synchronization statement by a sequence or a parallel
composition of synchronization statements. The following refinement rule will
allow us to do this.
Refinement Rule 8.2 (Refining synchronization statements)

Let $B_1, \ldots, B_n$ be predicates. Assuming that the environment preserves each
of the $B_j$, the synchronization statement

    while ∃j ≠ i. ¬B_j do skip

can be refined into either a sequence or a parallel composition of simpler
synchronization statements. More precisely,

    [tt, {B_j | 1 ≤ j ≤ n}]
      while ∃j ≠ i. ¬B_j do skip
    ≻ for j := 1 to n st j ≠ i do
        while ¬B_j do skip
    [tt, Preds(Var)]

and

    [tt, {B_j | 1 ≤ j ≤ n}]
      while ∃j ≠ i. ¬B_j do skip
    ≻ for j := n to 1 st j ≠ i do
        while ¬B_j do skip
    [tt, Preds(Var)]

and

    [tt, {B_j | 1 ≤ j ≤ n}]
      while ∃j ≠ i. ¬B_j do skip
    ≻ ‖_{j ≠ i} while ¬B_j do skip
    [tt, Preds(Var)]

Proof: See Section A.4.4, page 262.  ■
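
The assumption that the environment preserves each $B_j$ is what makes the one-at-a-time tests sound: a condition that has been observed to hold cannot become false again. The following Python sketch illustrates this on a toy discrete-time model, which is an assumption made purely for illustration: $B_j$ becomes true at tick `ready_at[j]` and stays true, and every loop test costs one tick.

```python
def coarse_wait(ready_at):
    """while exists j. not B_j do skip: all B_j are tested atomically per tick."""
    t = 0
    while any(t < ready for ready in ready_at.values()):
        t += 1                     # skip
    return t

def sequential_wait(order, ready_at):
    """for j in order do while not B_j do skip: one B_j is tested per tick."""
    t = 0
    for j in order:
        while t < ready_at[j]:     # B_j does not hold yet
            t += 1                 # skip
        t += 1                     # the successful test of B_j
    return t

def all_hold(t, ready_at):
    """True iff every B_j holds at tick t (each B_j is stable once true)."""
    return all(t >= ready for ready in ready_at.values())

ready_at = {2: 3, 3: 1, 4: 5}      # hypothetical example for process i = 1
```

Whichever order is used, the sequential loop exits only at a tick at which every $B_j$ holds, because conditions tested earlier are still true. If the environment were allowed to falsify a $B_j$ again, this would no longer follow.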

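Proposition 8.4 (Refining $TIE^3_{at,at}$ into $TIE_{par,at}$, $TIE_{up,at}$ and $TIE_{down,at}$)

Let $E_{i,1}$ be

    E_{i,1} ≡ in[i], last[l] := l, i;
              []

and let $E_i$ be such that $TIE^3_{at,at}$ is of the form

$TIE^3_{at,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in
$\qquad \|_{i=1}^{n} E_i[E_{i,1}[\textbf{while } \exists j \neq i.\ \neg B_{i,j} \textbf{ do skip}]]$

where $\neg B_{i,j} \equiv in[j] \geq l \wedge last[l] = i$.

1.  Let $TIE_{par,at}$ be as $TIE^3_{at,at}$ except that every occurrence of

    while ∃j ≠ i. ¬B_{i,j} do skip

is replaced by

    ‖_{j ≠ i} while ¬B_{i,j} do skip.

Formally,

$TIE_{par,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in
$\qquad \|_{i=1}^{n} E_i[E_{i,1}[\|_{j=1, j \neq i}^{n} \textbf{while } \neg B_{i,j} \textbf{ do skip}]]$.

Then, $TIE_{par,at} =_{\mathcal{T}^\dagger} TIE^3_{at,at}$.

2.  Let $TIE_{up,at}$ be as $TIE^3_{at,at}$ except that every occurrence of

    while ∃j ≠ i. ¬B_{i,j} do skip

is replaced by

    for j := 1 to n st j ≠ i do while ¬B_{i,j} do skip.

Formally,

$TIE_{up,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in
$\qquad \|_{i=1}^{n} E_i[E_{i,1}[\textbf{for } j := 1 \textbf{ to } n \textbf{ st } j \neq i \textbf{ do while } \neg B_{i,j} \textbf{ do skip}]]$.

Then, $TIE_{up,at} =_{\mathcal{T}^\dagger} TIE^3_{at,at}$.

3.  Let $TIE_{down,at}$ be as $TIE^3_{at,at}$ except that every occurrence of

    while ∃j ≠ i. ¬B_{i,j} do skip

is replaced by

    for j := n to 1 st j ≠ i do while ¬B_{i,j} do skip.

Formally,

$TIE_{down,at} \equiv$ new $in[1..n] = 0,\ last[1..n] = 0$ in
$\qquad \|_{i=1}^{n} E_i[E_{i,1}[\textbf{for } j := n \textbf{ to } 1 \textbf{ st } j \neq i \textbf{ do while } \neg B_{i,j} \textbf{ do skip}]]$.

Then, $TIE_{down,at} =_{\mathcal{T}^\dagger} TIE^3_{at,at}$.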

Proof: 1) For all $1 \leq i \leq n$ we have

    [tt, {B_{i,j} | 1 ≤ j ≤ n}]
      while ∃j ≠ i. ¬B_{i,j} do skip
    ≻ ‖_{j ≠ i} while ¬B_{i,j} do skip
    [tt, Preds(Var)]

due to Refinement Rule 8.2. By ATOM, SEQ, and FOR

    R_{6,i} ≡ [tt, {B_{i,j} | 1 ≤ j ≤ n}]
                C_i ≻ C'_i                             ATOM, SEQ, FOR
              [tt, {B_{k,l} | 1 ≤ k, l ≤ n ∧ k ≠ i}]

where

    C_i  ≡ E_i[E_{i,1}[while ∃j ≠ i. ¬B_{i,j} do skip]]
    C'_i ≡ E_i[E_{i,1}[‖_{j ≠ i} while ¬B_{i,j} do skip]]

for all $1 \leq i \leq n$. Then, by taking the parallel composition

    R_7 ≡ [tt, {B_{i,j} | 1 ≤ i, j ≤ n}]
            ‖_i C_i ≻ ‖_i C'_i                          PAR-V-N(R_{6,i})
          [tt, Preds(∅)]

and declaring the local variables

    R_8 ≡ [tt, Preds(∅)]
            TIE^3_{at,at} ≻ TIE_{par,at}               NEW(R_7, in, last)
          [tt, Preds(∅)].

The desired trace equivalence follows from $R_8$ with Lemma 5.5.

2) and 3) are analogous to the previous case.  ■

8.7.3  Refinement  by  increasing  granularity 

We will now replace the coarse-grained multiple assignment

    A_i ≡ in[i], last[l] := l, i

by two finer-grained statements. In contrast to the previous refinements, this
one will destroy the invariant $P$ of Lemma 8.4. Note how the simultaneity of
the updates of $in[i]$ and $last[l]$ is crucial for the proof of Lemma 8.4. More
specifically,

•  if $A_i$ was replaced by $A'_i \equiv in[i] := l \,;\, last[l] := i$, then

$[P \wedge B_i,\ \{P\} \cup \Gamma_{\{i\}}] \quad A'_i \quad [tt,\ \Gamma_{N \setminus \{i\}}]$

would not hold, and

•  if $A_i$ was replaced by $A''_i \equiv last[l] := i \,;\, in[i] := l$, then neither

$[P \wedge B_i,\ \{P\} \cup \Gamma_{\{i\}}] \quad A''_i \quad [tt,\ \{P_1\}]$

nor

$[P \wedge B_i,\ \{P\} \cup \Gamma_{\{i\}}] \quad A''_i \quad [tt,\ \{P_2\}]$

would be valid.

In both cases, $P$ of Lemma 8.4 ceases to be an invariant. Note that $P$ is used
not only to prove mutual exclusion but also eventual entry. In contrast to some
approaches in the literature [OG76a, And91, AO91] we will not attempt to find
a new invariant using auxiliary variables or location predicates. Instead, we will
apply correctness-preserving transformation laws which will now be developed.

To start us off, we consider a special case of Lemma 2.1.7 from page 20.
Remember that a context is sequential if its hole is not in the scope of a parallel
composition.

Lemma 8.6 (Increasing granularity I)

If $E$ is a sequential context and neither $x_1$ nor $x_2$ occur free in $C$, then

    new x_1 = v_{0,1}, x_2 = v_{0,2} in [E[x_1, x_2 := v_1, v_2] ‖ C]
    =_{T†} new x_1 = v_{0,1}, x_2 = v_{0,2} in [E[x_1 := v_1 ‖ x_2 := v_2] ‖ C].

□

This rule is not appropriate for refining the tie-breaker programs of
Proposition 8.4, because both $in[i]$ and $last[l]$ occur free in the parallel context $C$.
More precisely, both variables occur free in synchronization statements in the
parallel context. To encompass this situation, we modify the above rule.

Refinement Rule 8.3 (Increasing granularity II)

Let

    E ≡ new x_1 = v_{0,1}, x_2 = v_{0,2} in E'

where $E' \equiv E'' \,\|\, C$ for some sequential context $E''$ and some program $C$.
Consider the atomic statement $x_1, x_2 := v_1, v_2$. If

1.  $C$ mentions $x_1, x_2$ only in stuttering statements $\{B\}$,

2.  for all stuttering statements $\{B\}$ in $C$ we have

    •  if $[s \mid x_1 = v_1] \models B$, then either $s \models B$ or $[s \mid x_1 = v_1, x_2 = v_2] \models B$,
       and

    •  if $[s \mid x_2 = v_2] \models B$, then either $s \models B$ or $[s \mid x_1 = v_1, x_2 = v_2] \models B$

    for all states $s$,

then $E[x_1, x_2 := v_1, v_2] =_{\mathcal{T}^\dagger} E[x_1 := v_1 \,\|\, x_2 := v_2]$.  □

Replacing $x_1, x_2 := v_1, v_2$ by $x_1 := v_1 \,\|\, x_2 := v_2$ creates two, possibly new
intermediate states: $[s \mid x_1 = v_1]$ and $[s \mid x_2 = v_2]$. Intuitively, condition 2 expresses
that whenever one of the intermediate states makes $B$ true, then either the state
right before or the state right after both assignments also makes $B$ true. Due to
this condition, for instance, the multiple assignment $x_1, x_2 := 1, 1$ in

    new x_1 = 0, x_2 = 0 in
      [x_1, x_2 := 1, 1 ‖ {x_1 = 1 ∧ x_2 = 0}]

cannot be replaced by $x_1 := 1 \,\|\, x_2 := 1$, because the refinement would introduce
an execution (the above program has no executions). However, in program

    new x_1 = 0, x_2 = 0 in
      [x_1, x_2 := 1, 1 ‖ {x_1 = 1 ∨ x_2 = 0}]

the same multiple assignment can be replaced by $x_1 := 1 \,\|\, x_2 := 1$. Unfortunately,
the above rule still is not applicable to the tie-breaker programs. Suppose the
parallel environment contains a process $j$ that is entering level $l$. Then, variable
$last[l]$ will also be written by process $j$. The variables $in[i]$ and $last[l]$ are thus
accessed as follows:

•  $in[i]$ is written only by process $i$, but read by the parallel environment,

•  $last[l]$ is written by process $i$. The parallel environment also reads
and writes it in constant assignments, that is, assignments of the form
$last[l] := c$ where $c$ is a constant.

The following rule addresses the situation.

Refinement Rule 8.4 (Increasing granularity III)

Let

    E ≡ new x_1 = v_{0,1}, x_2 = v_{0,2} in E'

where $E' \equiv E'' \,\|\, C$ for some sequential context $E''$ and some program $C$.
Consider the atomic statement $x_1, x_2 := v_1, v_2$. If

1.  $C$ mentions $x_1$ only in stuttering statements $\{B\}$, and

2.  $C$ mentions $x_2$ only in stuttering statements $\{B\}$ and on the left-hand
side of constant assignments, and

3.  for all stuttering statements $\{B\}$ in $C$ we have

    •  if $[s \mid x_1 = v_1] \models B$, then $s \models B$

    for all states $s$,

then $E[x_1, x_2 := v_1, v_2] =_{\mathcal{T}^\dagger} E[x_1 := v_1 \,;\, x_2 := v_2]$.

Proof: See Section A.4.5, page 264.  ■

The intuition behind this rule is as follows. Right after the multiple assignment
we have $x_1 = v_1 \wedge x_2 = v_2$ and thus also $x_2 = v_2 \Rightarrow x_1 = v_1$. The
environment may change the value of $x_2$, but will leave $x_1$ unchanged. Consequently,
the implication $x_2 = v_2 \Rightarrow x_1 = v_1$ is preserved by the environment.
Moreover, the implication holds in the intermediate state after executing $x_1 := v_1$
but before executing $x_2 := v_2$. Reversing the order of the two assignments in general
does not yield a correct refinement, because the implication does not hold
in the intermediate step. For illustration, consider the following program

    C ≡ new r = 0, done = tt in

          x := 5;                     await ¬done;
          done := ff;            ‖    await done;
          r, done := x · x, tt        await r ≠ 25

In $C$, $r, done := x \cdot x, tt$ can be replaced by $r := x \cdot x \,;\, done := tt$ without changing the
set of executions of $C$, because we have $done \Rightarrow r = 25$ in the intermediate and
final states. However, replacing $r, done := x \cdot x, tt$ by $done := tt \,;\, r := x \cdot x$ changes
the set of executions of $C$, because the flag $done$ does not indicate $r$ carrying
the result anymore.
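
The effect of the assignment order can be checked by enumerating all interleavings of a writer and a reader. The Python sketch below assumes the reading of the example in which the writer executes $x := 5$, $done := ff$ and then the assignment(s) to $r$ and $done$, while the reader executes await $\neg done$, await $done$ and await $r \neq 25$ in this order. That reading, the initial value 0 for $x$, and all function names are assumptions for illustration.

```python
def reader_can_finish(writer):
    """Explore all interleavings; True iff the reader can pass `await r != 25`."""
    reader = [lambda s: not s["done"],      # await not done
              lambda s: s["done"],          # await done
              lambda s: s["r"] != 25]       # await r != 25
    init = (0, 0, (("done", True), ("r", 0), ("x", 0)))
    seen, stack = {init}, [init]
    while stack:
        wpc, rpc, frozen = stack.pop()
        if rpc == len(reader):
            return True                     # the reader passed every await
        state = dict(frozen)
        candidates = []
        if wpc < len(writer):               # one atomic step of the writer
            s = dict(state)
            writer[wpc](s)
            candidates.append((wpc + 1, rpc, tuple(sorted(s.items()))))
        if reader[rpc](state):              # the reader passes its current await
            candidates.append((wpc, rpc + 1, frozen))
        for c in candidates:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return False

def set_x(s): s["x"] = 5
def clear_done(s): s["done"] = False
def atomic(s): s["r"], s["done"] = s["x"] * s["x"], True
def set_r(s): s["r"] = s["x"] * s["x"]
def set_done(s): s["done"] = True

prefix = [set_x, clear_done]
```

With the atomic assignment, and with the order `set_r` then `set_done`, the reader never gets past its last await. With the reversed order it can, because there is an intermediate state in which `done` is true while `r` is still 0.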

Note that Refinement Rule 8.4 also is applicable if the introduced assignments
$x_1 := v_1$ and $x_2 := v_2$ are not atomic. Unfortunately, the applicability of
this rule is still limited because $x_1$ and $x_2$ can only be mentioned in stuttering
steps $\{B\}$ by the parallel environment. To be applicable to the tie-breaker
algorithm, for example, the environment must be allowed to mention $x_1$ and $x_2$
in synchronization statements. We thus want to generalize the rule to accommodate
this situation. However, this poses the following problem. Suppose, for
instance, that $(x_1 := v_1 \,;\, x_2 := v_2)$ is executed infinitely often along some execution
$\alpha$ of $E'[(x_1 := v_1 \,;\, x_2 := v_2)]$ and that $C$ contains a $B$-synchronization statement
await $B$ such that $B$ is always false when control resides between the two
assignments. Also assume that $B$ is always true right after both assignments
have been executed. This means that $(x_1 := v_1 \,;\, x_2 := v_2)$ in context $E'$ offers
infinitely many states along $\alpha$ with $\neg B$ whereas $(x_1, x_2 := v_1, v_2)$ does not
necessarily. Consequently, it may be the case that await $B$ is blocked forever in
$E'[(x_1 := v_1 \,;\, x_2 := v_2)]$ but not in $E'[(x_1, x_2 := v_1, v_2)]$. As an illustration, consider
the program

    new x_1 = 0, x_2 = 0 in

          while tt do
            x_1, x_2 := 0, 0;
            x_1, x_2 := 1, 1      ‖    await x_1 = x_2
          od

Replacing $x_1, x_2 := 1, 1$ by $x_1 := 1 \,;\, x_2 := 1$ would introduce an infinite execution
in which the right program never terminates. To circumvent this problem,
we require that both the refined and the refining program have the eventual
entry property, that is, none of their synchronization statements is ever blocked
forever. We now finally arrive at the refinement rule that will allow us to refine
the multiple assignment in the tie-breaker algorithm.

Refinement Rule 8.5 (Increasing granularity IV)

Let

    E ≡ new x_1 = v_{0,1}, x_2 = v_{0,2} in E'

where $E' \equiv E'' \,\|\, C$ for some sequential context $E''$ and some program $C$.
Consider the atomic statement $x_1, x_2 := v_1, v_2$. If

1.  $C$ mentions $x_1$ only in stuttering statements $\{B\}$, and

2.  $C$ mentions $x_2$ only in stuttering statements $\{B\}$ and on the left-hand
side of constant assignments, and

3.  for all stuttering statements $\{B\}$ in $C$ we have

    •  if $[s \mid x_1 = v_1] \models B$, then $s \models B$

    for all states $s$, and

4.  $E'[x_1, x_2 := v_1, v_2]$ and $E'[x_1 := v_1 \,;\, x_2 := v_2]$ have the eventual entry
property,

then $E[x_1, x_2 := v_1, v_2] =_{\mathcal{T}^\dagger} E[x_1 := v_1 \,;\, x_2 := v_2]$.

Proof: See Section A.4.6, page 266.  ■

With the help of the above rule, each of the programs $TIE_{par,at}$, $TIE_{up,at}$,
and $TIE_{down,at}$ can now be refined by replacing

    in[i], last[l] := l, i

by

    in[i] := l ; last[l] := i.

Proposition  8.5  (Refining  TIEpar,at,  TIEup,at,  and  TIEdown,at) 

Let $E_i$ be as in Proposition 8.4.

1. Let $E_{i,1}$ be

    $E_{i,1} = [\,] \,;\, \parallel_{j=1, j \neq i}^{n} \mathbf{while}\ in[j] \geq l \wedge last[l] = i\ \mathbf{do\ skip}$.

Then, $TIE_{par,at}$ is of the form

    $TIE_{par,at} = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{i=1}^{n} E_i[E_{i,1}[in[i], last[l] := l, i]]$.

Let $TIE_{par,par}$ be as $TIE_{par,at}$ except that every occurrence of

    in[i], last[l] := l, i

is replaced by

    in[i] := l ; last[l] := i.

More formally,

    $TIE_{par,par} = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{i=1}^{n} E_i[E_{i,1}[in[i] := l \,;\, last[l] := i]]$.

Then, $TIE_{par,par} =_{rt} TIE_{par,at}$.
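2. Let $E_{i,2}$ be

    $E_{i,2} = [\,] \,;\, \mathbf{for}\ j = 1\ \mathbf{to}\ n\ \mathbf{st}\ j \neq i\ \mathbf{do\ while}\ in[j] \geq l \wedge last[l] = i\ \mathbf{do\ skip}$.

Then, $TIE_{up,at}$ is of the form

    $TIE_{up,at} = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{i=1}^{n} E_i[E_{i,2}[in[i], last[l] := l, i]]$.

Let $TIE_{up,par}$ be as $TIE_{up,at}$ except that every occurrence of

    in[i], last[l] := l, i

is replaced by

    in[i] := l ; last[l] := i.

More formally,

    $TIE_{up,par} = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{i=1}^{n} E_i[E_{i,2}[in[i] := l \,;\, last[l] := i]]$.

Then, $TIE_{up,par} =_{rt} TIE_{up,at}$.

3. Let $E_{i,3}$ be

    $E_{i,3} = [\,] \,;\, \mathbf{for}\ j = n\ \mathbf{downto}\ 1\ \mathbf{st}\ j \neq i\ \mathbf{do\ while}\ in[j] \geq l \wedge last[l] = i\ \mathbf{do\ skip}$.

Then, $TIE_{down,at}$ is of the form

    $TIE_{down,at} = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{i=1}^{n} E_i[E_{i,3}[in[i], last[l] := l, i]]$.

Let $TIE_{down,par}$ be as $TIE_{down,at}$ except that every occurrence of

    in[i], last[l] := l, i

is replaced by

    in[i] := l ; last[l] := i.

More formally,

    $TIE_{down,par} = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{i=1}^{n} E_i[E_{i,3}[in[i] := l \,;\, last[l] := i]]$.

Then, $TIE_{down,par} =_{rt} TIE_{down,at}$.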
Proof: 1) Refinement Rule 8.5 is applied $n$ times where the $i$-th application
establishes

    $TIE_{par,at}(i) = \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{k=1}^{i-1} E_k[in[k] := l \,;\, last[l] := k] \;\parallel$
    $\quad E_i[in[i] := l \,;\, last[l] := i] \;\parallel$
    $\quad \parallel_{k=i+1}^{n} E_k[in[k], last[l] := l, k]$

    $=_{rt} \mathbf{new}\ in[1..n] = 0,\ last[1..n] = 0\ \mathbf{in}$
    $\quad \parallel_{k=1}^{i-1} E_k[in[k] := l \,;\, last[l] := k] \;\parallel$
    $\quad E_i[in[i], last[l] := l, i] \;\parallel$
    $\quad \parallel_{k=i+1}^{n} E_k[in[k], last[l] := l, k]$

    $= TIE_{par,at}(i - 1)$.

Consider the first two conditions of Rule 8.5 where $C$ is the program that

    $E_i[in[i], last[l] := l, i]$

is executing in parallel with,

    $C = \parallel_{k=1}^{i-1} E_k[in[k] := l \,;\, last[l] := k] \;\parallel\; \parallel_{k=i+1}^{n} E_k[in[k], last[l] := l, k]$.

Obviously, $C$ mentions in[i] and last[l] only in synchronization statements. Also,
for every $B_j$-synchronization statement in $C$, if $[s \mid in[i] = l] \models B_j$, then

    $[s \mid in[i] = l, last[l] = i] \models B_j$

and if $[s \mid last[l] = i] \models B_j$, then

    $[s \mid in[i] = l, last[l] = i] \models B_j$.

Suppose  TIEpar,at{i  —  1)  was  shown  to  satisfy  the  eventual  entry  property 
using  Lemma  8.2.  Then,  with  the  observation  that  in[j]:=l  Arm 

last[l]:=j  C'j't  Arm  ^  sind  last[l]:-j  Cq-t  inv*mi  if  /  ^  in[i],  the 

same argument also applies to $TIE_{par,at}(i)$. Since $TIE_{par,at}(0) = TIE_{par,at}$
and $TIE_{par,at}(n) = TIE_{par,par}$, we get the desired result by transitivity.

2)  and  3)  As  above.  ■ 

Note how the introduction of l was necessary for the refinement of the multiple
assignment. In other words, in[i], last[in[i] + 1] := in[i] + 1, i cannot be
replaced by in[i] := in[i] + 1 ; last[in[i] + 1] := i.

8.7.4 Putting everything together

Figure  8.3  gives  an  overview  of  the  refinements  presented  in  this  section.  The 
correctness  of  the  most  abstract  version  TIE^f.  (Proposition  8.1)  and  Lemma  8.3 
imply  the  correctness  of  all  refinements. 
Figure 8.3: Overview of implementations of the tie-breaker algorithm


Corollary  8.2  (Correctness) 

All  programs  in  Figure  8.3  are  correct.  □ 

$TIE_{up}$ is equivalent to TIE of Example 8.1 and we thus have achieved the
verification of TIE, which was one of our original goals.
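
As an informal cross-check of the mutual exclusion claim for small instances, the following Python sketch enumerates all interleavings of an n-process tie-breaker in which the two assignments are executed sequentially and the other processes are inspected in ascending order. It is a bounded state-space exploration, not a replacement for the derivation above. Several details are assumptions made for the sketch only: processes are numbered from 0, levels run from 1 to n, each while guard is evaluated in one atomic step, and the names `explore`, `WRITE_IN`, `WRITE_LAST`, `WAIT` and `CRITICAL` are hypothetical.

```python
from collections import deque

WRITE_IN, WRITE_LAST, WAIT, CRITICAL = range(4)


def explore(n):
    """Explore all interleavings of the n-process tie-breaker (n >= 2).

    Returns (number of reachable states, maximal number of processes
    that are in their critical section at the same time).
    """
    others = [[k for k in range(n) if k != i] for i in range(n)]
    # state = (in, last, per-process (phase, level, index into others[i]))
    start = ((0,) * n, (-1,) * (n + 1), ((WRITE_IN, 1, 0),) * n)
    seen, queue, worst = {start}, deque([start]), 0
    while queue:
        in_, last, procs = queue.popleft()
        worst = max(worst, sum(p[0] == CRITICAL for p in procs))
        for i, (phase, l, idx) in enumerate(procs):
            new_in, new_last = list(in_), list(last)
            if phase == WRITE_IN:            # in[i] := l
                new_in[i] = l
                nxt = (WRITE_LAST, l, 0)
            elif phase == WRITE_LAST:        # last[l] := i
                new_last[l] = i
                nxt = (WAIT, l, 0)
            elif phase == WAIT:              # while in[j] >= l and last[l] = i do skip
                j = others[i][idx]
                if in_[j] >= l and last[l] == i:
                    continue                 # guard true: skip leaves the state unchanged
                if idx + 1 < n - 1:
                    nxt = (WAIT, l, idx + 1)
                elif l < n:
                    nxt = (WRITE_IN, l + 1, 0)
                else:
                    nxt = (CRITICAL, l, 0)
            else:                            # exit protocol in[i] := 0
                new_in[i] = 0
                nxt = (WRITE_IN, 1, 0)
            state = (tuple(new_in), tuple(new_last),
                     procs[:i] + (nxt,) + procs[i + 1:])
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return len(seen), worst
```

For n = 2 and n = 3 the second component of the result is 1, that is, no reachable state has two processes in the critical section. The search says nothing about eventual entry, which depends on fairness.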


8.8  Fine-grained  concurrency 

A very important property of the tie-breaker algorithm is that it places no
atomicity constraints on the execution. Its correctness is completely independent
of the level of granularity of the parallel execution. Note that neither the
bakery nor the ticket algorithm has this property. In this section we briefly
sketch  how  the  refinement  of  TIE\^  would  proceed  if  the  evaluation  of  boolean 
expressions  in  while  statements  is  not  atomic.  Not  surprisingly,  dropping  the 
atomicity  requirement  complicates  the  refinement.  Refinement  Rules  8.1  and  8.2 
become  unsound  and  need  to  be  replaced.  We  first  show  under  what  conditions 
a stuttering step in $B$ can be replaced by the non-atomic evaluation of $B$.
8.8.1 Refining non-atomic boolean expressions

For the purposes of this section, let $B_{j,k}$ for $1 \leq j \leq m$ and $1 \leq k \leq n$
range over atomic propositions, that is, non-composite boolean expressions that
are always evaluated in one single step. In accordance with the treatment in
Section 2.4, we will assume that the evaluation of the negation $\neg B_{j,k}$ also is
atomic. Furthermore, we assume that conjunction $B_{j,1} \wedge B_{j,2}$ and disjunction
$B_{j,1} \vee B_{j,2}$ are evaluated in some fixed, but unknown order. The following
lemma shows under which conditions the non-atomic evaluation of the boolean
expressions $\bigvee_j \bigwedge_k B_{j,k}$ and $\bigwedge_j \bigvee_k \neg B_{j,k}$ is equivalent to a single stuttering step
in $\bigvee_j \bigwedge_k B_{j,k}$ and $\bigwedge_j \bigvee_k \neg B_{j,k}$ respectively.

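Lemma 8.7 (Non-atomic expressions)

If the environment preserves the $B_{j,k}$, then the non-atomic evaluation of $\bigvee_j \bigwedge_k B_{j,k}$
to true is equivalent to a single stuttering step in $\bigvee_j \bigwedge_k B_{j,k}$, that is,

    $[tt, \{B_{j,k} \mid 1 \leq j \leq m \wedge 1 \leq k \leq n\}]$
    $\quad \{\bigvee_j \bigwedge_k B_{j,k}\} \;=\; \{\bigvee_j \bigwedge_k B_{j,k} \Downarrow tt\}$
    $[tt, Preds(Var)]$.

Moreover, if the disjunctions $\bigvee_k \neg B_{j,k}$ are preserved by the environment, then
the non-atomic evaluation of $\bigwedge_j \bigvee_k \neg B_{j,k}$ to true is equivalent to a single
stuttering step in $\bigwedge_j \bigvee_k \neg B_{j,k}$, that is,

    $[tt, \{\bigvee_k \neg B_{j,k} \mid 1 \leq j \leq m\}]$
    $\quad \{\bigwedge_j \bigvee_k \neg B_{j,k}\} \;=\; \{\bigwedge_j \bigvee_k \neg B_{j,k} \Downarrow tt\}$
    $[tt, Preds(Var)]$.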

Proof:

1. Using the stuttering closure condition we can show that every trace of
$\{\bigvee_j \bigwedge_k B_{j,k}\}$ also is a trace of $\{\bigvee_j \bigwedge_k B_{j,k} \Downarrow tt\}$, that is,

    $\{\bigvee_j \bigwedge_k B_{j,k}\} \subseteq_{rt} \{\bigvee_j \bigwedge_k B_{j,k} \Downarrow tt\}$

which implies one direction of the equivalence. The other direction follows,
because the evaluation of $\bigvee_j \bigwedge_k B_{j,k}$ in a parallel environment that preserves
all atomic propositions $B_{j,k}$ for $1 \leq j \leq m$ and $1 \leq k \leq n$ always
eventually passes through a stuttering step that satisfies $\bigvee_j \bigwedge_k B_{j,k}$.

2. As in the first case, the stuttering closure condition implies

    $\{\bigwedge_j \bigvee_k \neg B_{j,k}\} \subseteq_{rt} \{\bigwedge_j \bigvee_k \neg B_{j,k} \Downarrow tt\}$,

and thus one direction of the equivalence. To show the other direction,
we have to argue that the evaluation of $\bigwedge_j \bigvee_k \neg B_{j,k}$ in a parallel
environment that preserves all disjunctions $\bigvee_k \neg B_{j,k}$ for $1 \leq j \leq m$ always
eventually passes through a stuttering step that satisfies $\bigwedge_j \bigvee_k \neg B_{j,k}$.
Suppose that during the evaluation $\neg B_{j,k}$ evaluates to true in some state.
Thus, $\bigvee_k \neg B_{j,k}$ also holds in that state. Due to the assumptions, this
disjunction thus continues to hold. Thus, $\bigwedge_j \bigvee_k \neg B_{j,k}$ must also eventually
be true. ■
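
The preservation assumption of the lemma can be illustrated with a small Python sketch. It is a simplified model, not the trace semantics of the thesis: a boolean expression in disjunctive normal form is evaluated by reading every atom in the next state of a given state sequence, so the environment may change the state between two reads. The left-to-right evaluation order and all names (`eval_atomic`, `eval_nonatomic`, `dnf`, `preserving`, `interfering`) are assumptions made for the illustration.

```python
def eval_atomic(dnf, state):
    """Evaluate a disjunction of conjunctions of atoms in one single state."""
    return any(all(atom(state) for atom in conj) for conj in dnf)


def eval_nonatomic(dnf, trace):
    """Evaluate the same expression, reading each atom in the next state of `trace`.

    No short-circuiting is used, so exactly one state is consumed per atom.
    """
    states = iter(trace)
    result = False
    for conj in dnf:
        value = True
        for atom in conj:
            value = atom(next(states)) and value
        result = result or value
    return result


x_is_1 = lambda s: s["x"] == 1
y_is_1 = lambda s: s["y"] == 1
dnf = [[x_is_1, y_is_1]]                             # the expression x = 1 /\ y = 1

preserving = [{"x": 1, "y": 0}, {"x": 1, "y": 1}]    # x = 1 stays true once read
interfering = [{"x": 1, "y": 0}, {"x": 0, "y": 1}]   # x = 1 is invalidated in between
```

With `preserving`, the non-atomic evaluation yields true and the last state satisfies the whole expression. With `interfering`, the non-atomic evaluation also yields true although no single state of the sequence satisfies the expression, so it cannot be replaced by one stuttering step.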


8.8.2  Refining  non-atomic  synchronization  statements 

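The coarse-grained synchronization statement

    await $\bigwedge_j \bigvee_k \neg B_{j,k}$

is equivalent to the fine-grained synchronization statement

    while $\bigvee_j \bigwedge_k B_{j,k}$ do skip

if the environment preserves $\bigvee_k \neg B_{j,k}$ for all $j$ and $B_{j,k}$ for all $j$ and $k$.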

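Refinement Rule 8.6 (Equivalence of synchronization statements)

    $[tt, \{\bigvee_k \neg B_{j,k} \mid 1 \leq j \leq m\} \cup \{B_{j,k} \mid 1 \leq j \leq m \wedge 1 \leq k \leq n\}]$
    $\quad \mathbf{await}\ \bigwedge_j \bigvee_k \neg B_{j,k}$
    $\quad =$
    $\quad \mathbf{while}\ \bigvee_j \bigwedge_k B_{j,k}\ \mathbf{do\ skip}$
    $[tt, Preds(Var)]$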

Proof:  We  show 

111  =  [ti,  {Bj^k  1  1  <  i  <  ^  A  1  <  A:  <  n}] 

^  Lemma  8.7,  OMEGA 

[it,  Preds(0)] 
and


7^2  =  [u,  I  1  ^  ^  ^ 

^  Lemma  8.7 

JJ-w} 

~  Lemma  2,3 

(Lemma  2.1) 

{Ajyk-Bj,k  ^  tty ;  {\/jAkBj.k  jf} 

[ttjPreds{il\)], 

The desired result follows with an application of OR to $H_1$ and $H_2$ and the
definitions of await and while. ■

We  want  to  use  Refinement  Rule  8.6  to  refine  into  TIE\^  The 

synchronization  statement  of  process  i  in  TIE'^^  can  be  transformed  as  follows 

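    $\forall j \neq i.\ in[j] < in[i] \vee last[l] \neq i$
    $= \forall\, 1 \leq j \leq n.\ j \neq i \Rightarrow (in[j] < in[i] \vee last[l] \neq i)$
    $= \bigwedge_{j=1, j \neq i}^{n} (in[j] < in[i] \vee last[l] \neq i)$
    $= \bigwedge_{j=1, j \neq i}^{n} \bigvee_{k=1}^{2} \neg B_{j,k}$

where

    $\neg B_{j,1} = in[j] < in[i]$
    $\neg B_{j,2} = last[l] \neq i$.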

While the synchronization statement now has the required shape, the rule is not
applicable, because neither $B_{j,1} = in[j] \geq in[i]$ nor $B_{j,2} = last[l] = i$ is preserved
as required. Predicate $in[j] \geq in[i]$ is invalidated when process $j$ leaves its
critical region and then moves to level 0 by executing its exit protocol in[j] := 0.
Predicate $last[l] = i$ is invalidated when some other process $k$ also reaches level
$l$ and updates the last pointer by executing last[l] := k. In other words, the rule
is too weak for our purposes, because its assumptions are too strong.

Using  eventual  entry 

With  the  help  of  the  eventual  entry  property  we  now  develop  another  refinement 
rule  that  will  allow  us  to  refine  T1E\^  The  definitions  of  await  B  and 
while $\neg B$ do skip each consist of two disjuncts, the second of which deals
with  the  case  where  B  never  becomes  true.  Recall  that  Corollary  8.1  shows 
that  if  program  C  has  the  eventual  entry  property,  then  these  disjuncts  can  be 
removed  without  changing  the  executions  of  C.  Consequently,  in  this  case  only 
the  first  disjunct  needs  to  be  refined.  The  following  rule  applies  this  idea.  Note, 
however, that eventual entry is a property of the entire program. Consequently,
compositionality  is  compromised. 

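Refinement Rule 8.7 (Equivalence of synchronization statements)

Let $C$ be of the form

    $C = E[\mathbf{await}\ \bigwedge_j \bigvee_k \neg B_{j,k}] \parallel D$

where $E$ is sequential. If the parallel context $D$ preserves $\bigvee_k \neg B_{j,k}$ for all
$1 \leq j \leq m$, that is,

    $[tt, Preds(Var)]\ D\ [tt, \{\bigvee_k \neg B_{j,k} \mid 1 \leq j \leq m\}]$

and $C$ has the eventual entry property, then

    $C' = E[\mathbf{while}\ \bigvee_j \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$

also has the eventual entry property and

    $C =_{rt} C'$.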

Proof:  We  have 

JE;  [await  AjVfc"’-8j.fc]  II  ^ 

Corollary  8.1 

^[{AiVfc-5,- fc}]  II D 

Lemma  8.7 

\\D 

=•£%  (Lemma  2.1) 

^  ffV ;  a- «}]  II D 

Corollary  8.1 

E'[while  Vj do  skip]  ||  D. 


Using  Rule  8.7,  can  be  refined  into  TIE\^  The  following  rule 

allows  the  refinement  of  the  synchronization  statement 

    while $\bigvee_j \bigwedge_k B_{j,k}$ do skip

in  TIE\^^^^  into  a  parallel  composition  of  synchronization  statements 

    $\parallel_j$ while $\bigwedge_k B_{j,k}$ do skip,

in $TIE_{par,at}$, if disjunctions are always evaluated in parallel.

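Refinement Rule 8.8 (Refining synchronization statements)

Let $C$ be of the form

    $C = E[\mathbf{while}\ \bigvee_j \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$

where $E$ is sequential. If all disjunctions in $C$ are evaluated in parallel, that
is, $B_1 \vee B_2 = B_1 \parallel B_2$ for all $B_1$ and $B_2$, and $D$ preserves $\bigvee_k \neg B_{j,k}$ for all
$1 \leq j \leq m$, that is,

    $[tt, Preds(Var)]\ D\ [tt, \{\bigvee_k \neg B_{j,k} \mid 1 \leq j \leq m\}]$,

and $C$ has the eventual entry property, then

    $C_{par} = E[\parallel_j \mathbf{while}\ \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$
    $C_{up} = E[\mathbf{for}\ j = 1\ \mathbf{to}\ m\ \mathbf{st}\ j \neq i\ \mathbf{do\ while}\ \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$
    $C_{down} = E[\mathbf{for}\ j = m\ \mathbf{downto}\ 1\ \mathbf{st}\ j \neq i\ \mathbf{do\ while}\ \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$

also have the eventual entry property and

    $C =_{rt} C_{par} =_{rt} C_{up} =_{rt} C_{down}$.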

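Proof: We prove the rule for $C_{par}$ only. The cases for $C_{up}$ and $C_{down}$ are
similar. We have

    $E[\mathbf{while}\ \bigvee_j \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$
    $=_{rt}$  Corollary 8.1
    $E[\{\bigvee_j \bigwedge_k B_{j,k} \Downarrow tt\}^{*} \,;\, \{\bigvee_j \bigwedge_k B_{j,k} \Downarrow ff\}] \parallel D$
    $=_{rt}$  (*)
    $E[\parallel_j \{\bigwedge_k B_{j,k} \Downarrow tt\}^{*} \,;\, \{\bigwedge_k B_{j,k} \Downarrow ff\}] \parallel D$
    $=_{rt}$  Corollary 8.1
    $E[\parallel_j \mathbf{while}\ \bigwedge_k B_{j,k}\ \mathbf{do\ skip}] \parallel D$.

The equivalence (*) follows from

    $\{\bigvee_j P_j \Downarrow tt\}^{*} \,;\, \{\bigvee_j P_j \Downarrow ff\} =_{rt} [\parallel_j \{P_j \Downarrow tt\}^{*} \,;\, \{P_j \Downarrow ff\}]$

for all predicates $P_j$, $1 \leq j \leq m$, which can be shown by induction over $m$. ■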

Note that the introduction of fine-grained boolean expressions does not
invalidate the properties of trace equivalence in Lemma 2.1 and of trace inclusion in
Lemma 2.2. Moreover, the sufficient condition for eventual entry in Lemma 8.2
and the Refinement Rules 8.4 and 8.5 also continue to hold.


8.9  Discussion 

This section described the completely rigorous verification of the n-process
tie-breaker algorithm. Mutual exclusion, deadlock freedom and eventual entry were
shown. The treatment of finer levels of granularity was sketched. Moreover,
several alternative implementations of the tie-breaker algorithm were derived.
Some of these implementations exhibit more parallelism than the standard
textbook implementation. As in the previous examples, the refinement framework
has allowed us to expose the “degrees of implementation-freedom” offered by
the algorithm. In fact, this work was motivated partly by the question whether
$TIE_{down}$ and $TIE_{par}$ would indeed be correct solutions.
Remarks about the derivation

A  few  remarks  about  the  derivations  in  this  section  are  in  order. 

• The n-process bakery algorithm and the n-process ticket algorithm as
presented in Examples 8.2 and 8.3 could have been verified in a similar
fashion. However, compared to the tie-breaker algorithm these two algorithms
do not give rise to the same number of different refinements. The
resulting refinement trees are neither as deep nor as bushy.

• Context-sensitive approximation and thus labels were essential for our
treatment of eventual entry, a liveness property that is notoriously hard
to establish. More precisely, both the characterization of eventual entry
in Lemma 8.1 and the sufficient condition expressed in Lemma 8.2 hinge
on context-sensitive approximation. Moreover, the Rules 8.4 and 8.5 that
allow the replacement of a multiple assignment $x_1, x_2 := v_1, v_2$ by two
sequential assignments $x_1 := v_1 \,;\, x_2 := v_2$ crucially depend on context-sensitive
approximation for their correctness proofs.

• Some of the refinements of this section are not commutative. The refinement
of the synchronization statements from

    await $\forall j \neq i.\ in[j] < l \vee last[l] \neq i$

into

    $\parallel_{j=1, j \neq i}^{n}$ while $in[j] \geq l \wedge last[l] = i$ do skip

for instance, crucially depended upon the preservation of the predicate
$in[j] < l \vee last[l] \neq i$. The refinement of $in[i], last[l] := l, i$ into
$in[i] := l \,;\, last[l] := i$ in turn depended on the introduction of the loop counter $l$ in
the  first  refinement  TIE\^  However,  this  refinement  also  destroys  the 
preservation  of  the  above  property.  Thus,  this  step  had  to  be  postponed. 

Comparison  to  examples  in  Chapters  6  and  7 

Undoubtedly, the n-process mutual exclusion problem is substantially more
difficult than the problems discussed earlier. Both the correctness properties and
the interactions between the parallel processes are a lot more intricate. The
additional level of complexity requires a treatment that differs from the previous
treatments, mainly in two aspects.

•  Correct  behaviour  of  a  mutual  exclusion  algorithm  cannot  be  captured  as 
easily  and  concisely  as  in  the  previous  examples.  Rather  than  a  one-line 
or  two-line  program,  a  more  complex  program  has  to  be  chosen  as  the 
initial  specification.  While  the  correctness  of  the  initial  specification  is 
far  from  obvious,  it  still  is  abstract  enough  to  allow  for  a  straightforward 
verification.  The  tie-breaker  example  demonstrates  that  our  approach 
also  supports  the  development  of  programs  whose  correct  behaviour  is 
more  involved  and  impossible  to  capture  in  terms  of  standard  pre-  and 
postconditions,  for  instance. 
• In the previous examples, the rules offered in the calculus were sufficient
to cover the entire derivation. In this example, however, new, more
specialized rules had to be developed. On the one hand, this illustrates that
no set of rules will ever be general enough to cover all possible derivation
and verification needs. On the other hand, it also provides some hope that
missing rules can always be developed without too much effort.
Chapter 9

Related  Work 


Our  work  presents  a  methodology  for  the  formal  development  of  concurrent 
programs.  The  results  rest  on  the  marriage  of  two  lines  of  existing  work: 

1.  the  formal,  systematic  development  and  verification  of  programs,  on  the 
one  hand,  and 

2.  the  work  on  semantic  models  for  concurrent  computation  on  the  other. 

The  following  two  sections  present  related  work  in  each  of  the  two  fields  in 
more  detail. 


9.1 Formal program development and verification

9.1.1  Refinement  calculi  for  sequential  programs 

This  section  reviews  some  formalizations  of  stepwise  refinement  for  sequential 
programming.  While  some  concepts  used  in  these  formalizations  reappear  in 
our  work,  the  main  difference  is  that  they  do  not  address  concurrency. 

Hoare-triples and weakest preconditions induce a very natural notion of
refinement between two sequential programs. $C$ is refined by $C'$ iff every
Hoare-triple satisfied by $C$ also is satisfied by $C'$, that is, $\{P\}\ C\ \{Q\}$ implies
$\{P\}\ C'\ \{Q\}$ for all $P$ and $Q$. Since $\{P\}\ C\ \{Q\}$ holds iff $P$ implies the weakest
precondition of $C$ with respect to $Q$, that is, $P \Rightarrow wp(C, Q)$, refinement between
$C$ and $C'$ can also be expressed as $wp(C, Q)$ implies $wp(C', Q)$ for all $Q$.
Informally, refinement expresses that every behaviour of $C'$ can also be exhibited by
$C$. Typically, $C'$ exhibits less non-determinism than $C$. The calculi by Morris
and Morgan to be presented below both use this notion of refinement.

Morris’ refinement calculus. Morris was one of the first people to use weakest
preconditions for a formalization of the program development process suggested
by Dijkstra in [Dij76]. Inspired by the idea to embed programs and
specifications within the same framework, e.g., [Abr85, Heh84, Hoa85], he
introduces prescriptions $(P, Q)$ to specify an atomic transition that ends in a state
satisfying predicate $Q$ provided the initial state satisfies predicate $P$. For
instance, $(x = 0, x = 1)$ specifies a statement that sets $x$ to 1 if started in an
initial state with $x = 0$. It is thus refined by $x := 1$, for example. Since other
variables may be set arbitrarily, $x, y := 1, 5$ also refines the same prescription. If
variables other than $x$ are not to be changed, this needs to be expressed
explicitly in the prescription. $Q$ is a simple predicate and thus specifies only the
final state and cannot express a relation between initial and final state. This
necessitates auxiliary variables to express that a variable does not
change or to characterize, for instance, the behaviour of $x := x + 1$. Assuming
that $Q$ is a property over an infinite domain like the natural numbers, this
means the loss of bounded nondeterminism [Mor87]. This has deep semantic
consequences [Dij76, Rey98]. For a specification language, however, this loss is
tolerable. Morris gives a weakest precondition semantics to prescriptions and
defines refinement in terms of weakest preconditions just as outlined above.
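
The weakest-precondition notion of refinement and the prescription example can be checked mechanically on a small finite state space. The Python sketch below is an illustration only and rests on modelling assumptions that are not taken from Morris' work: states are pairs (x, y) over a tiny domain, a program maps a state to its set of possible final states, and an initial state outside the precondition of a prescription is modelled by `None` (no guarantee). The names `prescription`, `assign`, `wp` and `refines` are hypothetical.

```python
from itertools import chain, combinations

# States are pairs (x, y); the domain is kept tiny so that all
# postconditions (subsets of STATES) can be enumerated.
STATES = [(x, y) for x in (0, 1) for y in (0, 5)]


def subsets(items):
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def prescription(pre, post):
    """Prescription (P, Q): from a state satisfying P, end in any state satisfying Q."""
    return lambda s: {t for t in STATES if post(t)} if pre(s) else None


def assign(f):
    """Deterministic assignment given as a function on states."""
    return lambda s: {f(s)}


def wp(prog, q):
    """Initial states from which prog is guaranteed to end in q."""
    return {s for s in STATES if prog(s) is not None and prog(s) <= q}


def refines(spec, impl):
    """spec is refined by impl iff wp(spec, Q) implies wp(impl, Q) for all Q."""
    return all(wp(spec, set(q)) <= wp(impl, set(q)) for q in subsets(STATES))


spec = prescription(lambda s: s[0] == 0, lambda s: s[0] == 1)  # (x = 0, x = 1)
```

In this model `refines(spec, assign(lambda s: (1, s[1])))` and `refines(spec, assign(lambda s: (1, 5)))` both hold, matching the claims about x := 1 and x, y := 1, 5, whereas an assignment that leaves x at 0 does not refine the prescription.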

Morgan’s refinement calculus. Independently, Morgan proposed specification
statements $V{:}[P, Q]$ [Mor89]. Compared to Morris’ prescriptions, specification
statements allow for more concise specifications, because all variables not
in $V$ can implicitly be assumed to remain unchanged. For instance, $\{x\}{:}[x = 0, x = 1]$
specifies a statement that sets $x$ to 1 if the initial state satisfies $x = 0$
and leaves all other variables unchanged. Moreover, using primed and unprimed
variables, $Q$ is capable of expressing a relation between initial and final state
which allows $x := x + 1$ to be characterized by $\{x\}{:}[tt, x' = x + 1]$ and thus
obviates the need for auxiliary variables. The resulting refinement calculus rests
on the same notion of refinement and features similar rules [Mor94, MV94].
Morgan’s specification statement resurfaces in our work. Note, however, that in
Morgan’s setting the behaviour of $V{:}[P, Q]$ in an initial state that does not
satisfy $P$ is completely arbitrary. Even non-termination is possible. In our setting,
however, in initial states that do not satisfy $P$ the statement $V{:}[P, Q]$ exhibits
no behaviour at all. More concretely, to use Morgan’s interpretation, we would
have to change the meaning of $V{:}[P, Q]$ from our

    $\{(s, s') \mid (s, s') \models \varphi_{V:[P,Q]}\}$

to

    $\{(s, s') \mid s \models P \Rightarrow (s, s') \models \varphi_{V:[P,Q]}\}$.

The reason for this change basically is that Morgan’s implicative interpretation
is inappropriate for our purposes. To see why, consider, for instance, our
trace-theoretic definition of the conditional

    if $B$ then $C_1$ else $C_2$ $= (\{B\} \,;\, C_1) \vee (\{\neg B\} \,;\, C_2)$

where $\{B\} = \emptyset{:}[B, B]$. Under Morgan’s interpretation of $V{:}[P, Q]$, the stuttering
step $\{B\}$ in the then branch would exhibit arbitrary behaviour if $B$ was
not satisfied. Consequently, the program on the right would not capture the
standard behaviour of the conditional anymore.
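
The difference between the two readings of a specification statement can be made concrete on a three-element state space. The following Python sketch is a simplified model introduced for illustration: a statement denotes a set of pairs of initial and final states, `ours` keeps only pairs whose initial state satisfies the precondition, and `morgan` uses the implicative reading. The function names and the encoding of the stuttering step are assumptions of the sketch.

```python
STATES = range(3)


def ours(pre, post):
    """No behaviour at all from initial states that violate the precondition."""
    return {(s, t) for s in STATES for t in STATES if pre(s) and post(s, t)}


def morgan(pre, post):
    """Implicative reading: arbitrary behaviour if the precondition is violated."""
    return {(s, t) for s in STATES for t in STATES if not pre(s) or post(s, t)}


# Stuttering step {B} with B = (s == 0): the empty frame forces t == s.
b = lambda s: s == 0
stutter_post = lambda s, t: t == s and b(t)
```

Here `ours(b, stutter_post)` contains only the pair (0, 0), while `morgan(b, stutter_post)` additionally contains every pair starting in state 1 or 2, which is the arbitrary behaviour that would corrupt the then branch of the conditional.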

Note that the notion of refinement based on weakest preconditions is not
context-sensitive in the sense of Section 4.2. For instance, in context

    x := 0 ; [ ]

the program x := x + 1 cannot be replaced by x := 1, because $wp(x := x + 1, x = 2)$
does not imply $wp(x := 1, x = 2)$. Instead, the parts of the context must be
made part of the program. In this case, the weakest preconditions of the entire
program must be computed to determine that

    $wp(x := 0 \,;\, x := x + 1, Q)$

implies

    $wp(x := 0 \,;\, x := 1, Q)$

for all $Q$ (and vice versa).
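
The two claims can be verified by a short calculation with the standard rules $wp(x := e, Q) = Q[e/x]$ and $wp(C_1 ; C_2, Q) = wp(C_1, wp(C_2, Q))$. The derivation below is a worked example added for illustration.

```latex
\begin{align*}
wp(x := x + 1,\; x = 2) &= (x + 1 = 2) = (x = 1) \\
wp(x := 1,\; x = 2)     &= (1 = 2) = \mathit{false} \\[4pt]
wp(x := 0 \,;\, x := x + 1,\; Q) &= wp(x := 0,\; Q[x+1/x]) = Q[x+1/x][0/x] = Q[1/x] \\
wp(x := 0 \,;\, x := 1,\; Q)     &= wp(x := 0,\; Q[1/x]) = Q[1/x]
\end{align*}
```

Since $x = 1$ does not imply false, the replacement fails in isolation. In the context x := 0 ; [ ] both programs have the weakest precondition $Q[1/x]$ for every $Q$, so each refines the other.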


Hehner’s refinement calculus. Hehner’s refinement calculus differs from
the two calculi above in that atomic transitions are specified not by using pre-
and postconditions but rather by predicates over primed and unprimed
variables [Heh93]. For instance, $x := 1$ and $x := x + 1$ are expressed as
$x' = 1 \wedge \forall y \neq x.\ y' = y$ and $x' = x + 1 \wedge \forall y \neq x.\ y' = y$ respectively. Refinement is expressed
by implication. $x := 1$ refines $x' > 0$ because $x' = 1 \wedge \forall y \neq x.\ y' = y$ implies
$x' > 0$. Besides sequential programs Hehner also considers message-passing
concurrency. The set of variables used by each process must be disjoint.
Communication is achieved through the introduction of message-passing constructs.
Liveness properties and deadlock are dealt with through a distinguished time
variable. No attempt at dealing with fairness or achieving compositional
verification is made.
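
Refinement by implication is easy to test exhaustively on a finite domain. In the Python sketch below, a specification is a predicate over an initial state s = (x, y) and a final state t = (x', y'), and refinement is checked by enumerating all pairs of states. The restriction to the integers from -2 to 2 and the names `refines`, `assign_one`, `increment` and `positive` are assumptions made for the illustration.

```python
from itertools import product

DOMAIN = range(-2, 3)
STATES = list(product(DOMAIN, repeat=2))  # states are pairs (x, y)


def refines(spec, impl):
    """impl refines spec iff impl implies spec for all initial and final states."""
    return all(spec(s, t) for s, t in product(STATES, repeat=2) if impl(s, t))


assign_one = lambda s, t: t[0] == 1 and t[1] == s[1]          # x' = 1 and y' = y
increment = lambda s, t: t[0] == s[0] + 1 and t[1] == s[1]    # x' = x + 1 and y' = y
positive = lambda s, t: t[0] > 0                              # x' > 0
```

On this domain `refines(positive, assign_one)` holds, in line with the example in the text, while `refines(positive, increment)` fails because an initial value of x = -2 leads to x' = -1.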


The Z notation. Another successful and widely accepted formal program
specification and development methodology is Z [Spi89, PST96]. Like Hehner’s
work, Z uses implication to define refinement, but it additionally uses explicit
preconditions. The precise relationship between the notion of refinement used by
Hehner and Z on the one hand, and the weakest precondition-based notion of
refinement used by Morris and Morgan on the other hand is unclear.

Algebraic approaches. An alternative idea is to use algebraic specification
techniques to express the desired properties of the system to be
developed [BKL+91]. The resulting approaches emphasize a more property-oriented
and axiomatic style of specification and are thus based on a theory that is
rather different from ours. For instance, in the algebraic setting refinements
arise as category-theoretic morphisms between specifications. Development
frameworks for sequential programs that employ this algebraic approach
include the CIP project (“Computer-aided, Intuition-guided Programming”) in
Munich [BBB+85, B+87, Par90], the PSI, CHI, KIDS, and Specware projects
at Kestrel Institute [KB81, Kot83, Smi85, Smi91, Smi93, SM96, BS96, BGL+98],
and Extended ML [KST97].

9.1.2  Proof  systems  for  parallel  programs 

Research in programming methodology clearly shows that the step from sequential
programming to concurrent programming is not a trivial one. The presence
of concurrency complicates any theory of programming substantially. The vast
amount of research in this area bears witness to this.

Owicki and Gries’ work. A lot of attempts have been made to find an
adequate extension of Hoare’s logic to the concurrent setting. Indeed, the first
approaches towards a formal treatment of concurrency by Owicki and Gries in
1976 and Lamport in 1977 were based on this idea [OG76a, OG76b, Lam77].
In [OG76a, OG76b], Owicki and Gries introduce the notion of interference
freedom to obtain a syntax-directed proof system for a shared-variable concurrent
language. Hoare-triples are generalized to proof outlines. Whereas a
Hoare-triple only captures properties of the initial and final states, a proof outline
additionally keeps track of the predicates that hold during the execution of the
program. Formally, a proof outline is an annotated program in which any two
statements are separated by a predicate describing the properties that hold at
that point. To prove a proof outline corresponding to

    $\{P_1 \wedge P_2\}\ C_1 \parallel C_2\ \{Q_1 \wedge Q_2\}$

Owicki and Gries suggest first proving the outlines corresponding to

    $\{P_1\}\ C_1\ \{Q_1\}$ and $\{P_2\}\ C_2\ \{Q_2\}$

and then checking that the two proofs are interference-free, that is, all predicates
used in the first outline must be shown to be preserved by all assignments and
atomic regions in $C_2$ and vice versa. Schneider later extended this approach to
proof outline logic [SA86, Sch97]. Independently, Lamport developed an idea
that is similar to Owicki and Gries’ initial approach [Lam77].

Interference-freedom is a potentially very complex side-condition. Moreover,
it renders the logic non-compositional and thus only suitable for a-posteriori
verification of existing programs and unsuitable for program development through
stepwise refinement. The correctness of the composition can only be
determined after $C_1$ and $C_2$ have been completely developed. Consequently, the
fact that the parallel composition $C_1 \parallel C_2$ does not satisfy the Hoare-triple
$\{P_1 \wedge P_2\}\ C_1 \parallel C_2\ \{Q_1 \wedge Q_2\}$ will only be discovered at the very end of the
design of $C_1$ and $C_2$. In a compositional proof system the proof rule for a
composite program depends only on the specifications of the immediate
components, without knowledge of the interior structure of these components. Thus,
useless development efforts like the above are avoided from the start, leading to
a much more efficient development process. To summarize, Owicki and Gries’
work differs from ours in that it
• is geared towards program verification rather than program development.

• has no support for liveness properties except termination.

• does not handle fairness.

• is based on a coarse-grained trace semantics, that is, while the notion
of trace is very similar to transition traces, closure conditions are never
included or considered, and assignments are assumed to be atomic.


The next major contribution was the insight that in order to overcome the
problems with Owicki and Gries’ logic and to achieve true compositionality and
thus modular proofs, the interference between concurrent programs had to be
taken into account at the specification level. This insight led to
assumption-commitment reasoning which was first proposed by Francez and Pnueli [FP78]
and Jones [Jon81]. Given a parallel composition $C_1 \parallel C_2$, program $C_1$ is shown
to behave correctly assuming that $C_2$ behaves in a certain way and similarly
for $C_2$. Sometimes the correctness of $C_1$ and the correctness of $C_2$ are mutually
dependent: the guarantees of $C_1$ influence the assumptions of $C_2$ which influence
the guarantees of $C_2$ which influence the assumptions of $C_1$ which influence the
guarantees of $C_1$ and so on. Care must be taken to ensure that the reasoning
does not become circular and thus unsound. Consider, for instance, the following
two programs.

    C1 = y := 0;
         await y = 1;
         x := 1

and

    C2 = await x = 1;
         y := 1.


Let  P(x,y)  stand  for 


P(x,  y)  =  If  the  environment  eventually  sets  x  to  1, 

then  the  program  will  eventually  set  y  to  1. 


Although $C_1$ and $C_2$ satisfy $P(y, x)$ and $P(x, y)$ respectively, it is not the case
that $C_1 \parallel C_2$ will eventually set $x$ and $y$ to 1. $x$ being set to 1 by $C_1$ depends on
$y$ being set to 1 by $C_2$ which depends on $x$ being set to 1 by $C_1$ which depends
on ... The reasoning is circular and unsound. While all proof systems using
assumption-commitment reasoning in the literature are based on the idea of
explicitly specifying the assumptions and commitments of a program, there is a
lot of variation in the way they break this circularity. We will now review some
approaches that are most relevant to our work.
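
That the parallel composition of the two programs never sets x or y to 1 can be confirmed by enumerating its reachable states. The Python sketch below is an illustration under two assumptions that the text leaves open: both variables are initially 0, and every assignment and every await test is one atomic step. The function name `reachable` and the program-counter encoding are hypothetical.

```python
def reachable(x0=0, y0=0):
    """All states (x, y, pc1, pc2) reachable by interleaving C1 and C2."""
    start = (x0, y0, 0, 0)
    seen, todo = {start}, [start]
    while todo:
        x, y, pc1, pc2 = todo.pop()
        successors = []
        # C1 = y := 0 ; await y = 1 ; x := 1
        if pc1 == 0:
            successors.append((x, 0, 1, pc2))
        elif pc1 == 1 and y == 1:
            successors.append((x, y, 2, pc2))
        elif pc1 == 2:
            successors.append((1, y, 3, pc2))
        # C2 = await x = 1 ; y := 1
        if pc2 == 0 and x == 1:
            successors.append((x, y, pc1, 1))
        elif pc2 == 1:
            successors.append((x, 1, pc1, 2))
        for state in successors:
            if state not in seen:
                seen.add(state)
                todo.append(state)
    return seen
```

With the assumed initial state only two states are reachable, the initial one and the state after y := 0. Both await statements are then blocked forever, so neither variable is ever set to 1 although each program satisfies its conditional specification in isolation.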


Francez and Pnueli’s proof system. In [FP78], Francez and Pnueli consider
shared-variable parallel programs in which each variable is either a local variable,
an input variable or an output variable. The behaviour of a program is modeled
by sequences of states

    $s_0 \rightarrow s_1 \rightarrow s_2 \cdots$

where some of the steps may have been performed by the environment. Despite
the absence of explicit labeling a compositional treatment of parallel composition
is obtained, because program and environment transitions are identifiable by
the variables that they change. Specifications are of the form $(\varphi, \psi)$ where $\varphi$
and $\psi$ are formulas involving an explicit time variable. A behaviour meets the
specification $(\varphi, \psi)$ iff it satisfies $\psi$ whenever it satisfies $\varphi$. The rule for parallel
composition uses explicit induction over some well-founded set. The induction
not only avoids circularity, but also allows arbitrary formulas (including liveness
properties) to be used as assumptions and commitments. Pnueli later extended
the approach to use linear temporal logic [Pnu85]. The work differs from our
refinement calculus in that it

•  has  no  explicit  refinement  relation. 

•  does  not  handle  fairness. 

•  is  based  on  a  coarse-grained  trace  semantics.  All  three  approaches  use 
traces  as  the  underlying  semantic  model.  While  the  notion  of  trace  is 
very  similar  to  transition  traces,  closure  conditions  are  never  included  or 
considered,  which  results  in  a  rather  coarse-grained  semantics.  Moreover, 
assignments  are  assumed  to  be  atomic. 


Jones’ rely-guarantee reasoning. The work in [Jon81, Jon83a, Jon83b] also
employs shared variables as the means of communication. So-called potential
computations

s_0 -l_0-> s_1 -l_1-> s_2 -l_2-> s_3 ... s_i -l_i-> s_{i+1} -l_{i+1}-> s_{i+2} -l_{i+2}-> s_{i+3} ...

describe the program behaviour, where each l_i is a label ranging over {p, e}.
Transitions labeled with p indicate program transitions whereas the label e
indicates environment transitions. Quadruples of predicates (P, R, G, Q) specify
the behaviour of a parallel program. The pre-condition P together with the
rely-condition R constitute assumptions the developer of a program can make about
the environment. In return, the implementation must satisfy the
guarantee-condition G and terminate in a state satisfying the post-condition Q. Thus,

C ⊨ (P, R, G, Q)

means that if C is executed in initial states satisfying P and in environments
that change the state only according to R, then it will terminate in a state
satisfying Q and will only change the state according to G. A compositional,
syntax-directed proof system for total correctness of terminating, unfair,
shared-variable concurrent programs without synchronization based on rely-guarantee
reasoning (we have been using the term assumption-commitment reasoning) is
presented. Circularity is broken by admitting only safety properties as
assumptions. This restriction allows the soundness proof of the parallel composition rule
to proceed by induction over the length of the behaviour. Many researchers have
extended Jones’ approach. Complete compositional proof systems are presented
in [Stø91, Xu92]. In [Stø91], Stølen also extended the work to incorporate
synchronization statements. Compared to our refinement calculus, rely-guarantee
reasoning in the style of Jones differs as follows.

•  It  has  no  explicit  refinement  relation. 

•  It  has  no  support  for  liveness  properties  except  termination. 

•  It  does  not  handle  fairness. 

•  The context can only be described in terms of a pre- and a rely-condition.
Our context-sensitive approximation, however, allows the use of arbitrary
programming contexts, as used, for instance, in the formalization of
eventual entry in Lemma 8.1 on page 165.

•  When capturing the environment assumptions for a program, rely-guarantee
reasoning emphasizes conciseness whereas our work emphasizes minimality.
In other words, while the description of the environment assumptions
in rely-guarantee reasoning typically is very concise (e.g., x' = x, that is, x
does not change), it often also is stronger than necessary. In contrast, the
sets of predicates used in our work typically better support the expression
of the weakest necessary assumptions (e.g., x = 1 and x = 2 are preserved)
at the expense of being more unwieldy.

•  It  is  based  on  a  coarse-grained  trace  semantics.  All  three  approaches  use 
traces  as  the  underlying  semantic  model.  While  the  notion  of  trace  is 
very  similar  to  transition  traces,  closure  conditions  are  never  included  or 
considered,  which  results  in  a  rather  coarse-grained  semantics.  Moreover, 
assignments  are  assumed  to  be  atomic. 

•  It  is  geared  towards  terminating  programs.  Non-terminating  programs 
cannot  be  handled. 
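
The reading of a rely-guarantee quadruple given above can be illustrated on a
single finite, labeled computation. The following Python sketch is a
simplification under stated assumptions: it checks one terminated computation
as a whole (if the pre-condition and all environment steps are acceptable, then
all program steps and the final state must be), whereas a proof system reasons
about all computations of a program. The function name meets and the example
predicates over an integer state are hypothetical.

```python
def meets(states, labels, P, R, G, Q):
    """Check one finite, terminated computation against (P, R, G, Q).

    states -- s_0 .. s_n
    labels -- l_0 .. l_{n-1}, each "p" (program step) or "e" (environment step)
    """
    if len(labels) != len(states) - 1:
        raise ValueError("need exactly one label per transition")
    steps = list(zip(labels, states, states[1:]))
    # Assumptions: the initial state satisfies P and every e-step respects R.
    assumptions = P(states[0]) and all(R(s, t) for l, s, t in steps if l == "e")
    if not assumptions:
        return True  # nothing is promised when the assumptions are violated
    # Commitments: every p-step respects G and the final state satisfies Q.
    commitments = all(G(s, t) for l, s, t in steps if l == "p")
    return commitments and Q(states[-1])

# Hypothetical example: the state is a single integer counter.
P = lambda s: s == 0              # pre-condition: start at 0
R = lambda s, t: t >= s           # rely: the environment never decreases it
G = lambda s, t: t in (s, s + 1)  # guarantee: the program adds at most 1 per step
Q = lambda s: s >= 1              # post-condition: the counter is positive

ok = meets([0, 1, 3], ["p", "e"], P, R, G, Q)       # assumptions and commitments hold
vacuous = meets([0, 1, 0], ["p", "e"], P, R, G, Q)  # environment breaks R
bad = meets([0, 0], ["p"], P, R, G, Q)              # program fails to establish Q
print(ok, vacuous, bad)  # prints: True True False
```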


Stirling’s generalization of Owicki and Gries’ work. Stirling presents
a generalization of Owicki and Gries’ proof system. Again, compositionality is
achieved using assumption-commitment reasoning [Sti88]. Specifications of the
form

[P, Γ] C [Q, Δ]

are used to express that, if executed in an initial state satisfying P and in a
parallel environment preserving all predicates in Γ, C promises to terminate
in a state satisfying Q and to preserve all the predicates in Δ. Note that our
interpretation in Chapter 3 differs from Stirling’s only in that the parallel
environment of C cannot change the final state established by C. More precisely, in
a judgement [P, Γ] C [Q, Δ] that was derived using our rules, the assumptions
Γ always contain enough predicates to imply Q, that is, just like the
precondition, the postcondition needs to be protected from environment interference
(Lemma 5.6). In Stirling’s setting, however, Γ typically does not imply Q,
because C is not subject to interference after termination. As in Jones’ work,
proofs are modular due to the use of assumption-commitment reasoning and thus
program development is supported. Moreover, circular reasoning is avoided by
using safety properties as assumptions. Compared to our work, Stirling’s work
differs as follows.

•  It has no explicit refinement relation.

•  It has no support for liveness properties except termination.

•  It does not handle fairness.

•  As in Jones’ rely-guarantee reasoning, the context can only be described in
terms of a pre- and a rely-condition, while our context-sensitive
approximation allows more general descriptions.

•  It is based on a coarse-grained trace semantics. All three approaches use
traces as the underlying semantic model. While the notion of trace is
very similar to transition traces, closure conditions are never included or
considered, which results in a rather coarse-grained semantics. Moreover,
Stirling assumes assignments to be atomic.

•  It is geared towards terminating programs. Non-terminating programs
cannot be handled.
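
What it means for an environment step to preserve a set of predicates, and why
this can be a weaker assumption than requiring a variable to stay unchanged, can
be sketched in a few lines of Python. The names preserves, preserves_all, gamma
and unchanged_x, as well as the dictionary representation of states, are
assumptions made for this illustration only.

```python
def preserves(pred, before, after):
    """A step preserves pred if pred holding before implies that it holds after."""
    return (not pred(before)) or pred(after)

def preserves_all(preds, before, after):
    """A step preserves a set of predicates if it preserves each of them."""
    return all(preserves(p, before, after) for p in preds)

# Assumption set in the style of sets of predicates: x = 1 and x = 2 are preserved.
gamma = [lambda s: s["x"] == 1, lambda s: s["x"] == 2]

# The concise but stronger assumption: x does not change at all.
unchanged_x = lambda before, after: before["x"] == after["x"]

# An environment step from x = 0 to x = 5 preserves both predicates
# (neither held before), although it does change x.
step = ({"x": 0}, {"x": 5})
weak_ok = preserves_all(gamma, *step)
strong_ok = unchanged_x(*step)

# A step from x = 1 to x = 2 violates the preservation of x = 1.
breaks_gamma = not preserves_all(gamma, {"x": 1}, {"x": 2})

# Over a small range of values, "x unchanged" always implies preservation.
implied = all(preserves_all(gamma, {"x": a}, {"x": b})
              for a in range(4) for b in range(4) if a == b)

print(weak_ok, strong_ok, breaks_gamma, implied)  # prints: True False True True
```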

Other. A large number of other proof systems for concurrent languages exist. We
have concentrated here on the ones that seem closest to our work. Overviews
of compositional proof systems for concurrent languages can be found in [dR85,
HdR86, dRdBH+00].

9.1.3  Transformation  frameworks  for  parallel  programs

Another line of research has explored the use of semantics-preserving
transformations for the design and verification of parallel programs. One major
difference from refinement calculi, and thus from our work, is that these transformations
are based on semantic equivalences rather than (trace-theoretic) approximations
and inclusions.

Jones’ object-based design notation πoβλ. A lot of effort has been invested
into getting a formal handle on the notion of interference. Jones’ seminal work
on rely-guarantee reasoning has already been mentioned. In more recent work,
Jones argues that concepts from object-oriented languages present a promising
way of taming interference. In [Jon96], he presents an object-based design
notation called πoβλ (read “pobble”) that features the data encapsulation typical
for object-oriented languages but no inheritance. In πoβλ, objects that belong
to a class marked as unique are never shared and thus cannot interfere with
each other. Jones uses this fact to formulate two equivalences between πoβλ
programs which allow a sequential program to be replaced by an equivalent
parallel counterpart. For situations in which objects are subject to interference, he
shows that standard rely-guarantee reasoning meshes well with the object-based
program notation. Despite some promising results, the work seems still in its
initial stages. An appropriate semantics is needed to prove the equivalences in
general, to facilitate the discovery of new equivalences, and to develop a proof
theory. The definition of such a semantics has proved a challenge. Two lines of
research exist. While it is straightforward to define a mapping from πoβλ to the
π-calculus, this approach has so far only been able to produce proofs of specific
examples of the equivalences. The definition of an operational semantics in the
style of Plotkin [Plo91] has been more successful [HJ]. Compared to our work,
Jones is more concerned with aiding verification by removing parallelism rather
than formally constructing programs from specifications.

Apt and Olderog’s combination of proof outlines and transformations.
Apt and Olderog complement Owicki and Gries’ proof outline logic with
program transformations [AO91]. This allows them, for example, to deal with
fairness assumptions by using program transformations that embed a fair scheduler
into a given nondeterministic or concurrent program [AO83, OA88]. Rather
than verifying the original program under the fairness assumption, the
transformed program is verified without fairness assumption. The embedded fair
scheduler ensures the soundness of this approach. Note, however, that the
specific implementation of the scheduler now has to be included in the reasoning.
An additional use of program transformations is for the construction of
concurrent programs from sequential ones or of computationally complex programs
from simpler ones. In contrast to our work, this approach has no refinement
relation. Since it is based on proof outline logic, it is syntax-directed, but not
compositional. Just like in our setting, program transformations can be used
to introduce parallelism. The treatment of fairness, however, is fundamentally
different.


This thesis and the related work presented so far were only concerned with
managing the complexity of concurrency from a program development point of
view. As mentioned in the introduction, this is not the only reason why parallel
computers have not had more of an impact on mainstream computing. The
other impediment is the diversity of parallel machine models, which results in a
lack of portability and predictability of performance. There is a huge number
of sometimes very different parallel architectures. Different classes of parallel
architectures require radically different paradigms for describing and executing
computations efficiently. A number of different attempts to solve this problem
have been made. We will now review two attempts that employ transformational
programming.

Categorical datatypes. The Bird-Meertens formalism [Mee86, Bir87, Bir89]
consists of a collection of theories built on a base algebra. Each theory
captures the behaviour of a particular class of data structures. Theories have been
developed for lists, trees and arrays. Skillicorn shows that the Bird-Meertens
formalism is a universal, machine-independent model for four different
architecture classes [Ski90]. He moreover argues that this model also addresses
the software engineering difficulties of parallel programming through its
support for transformational programming. The rich set of algebraic identities not
only allows the stepwise transformation of an inefficient, easy-to-understand
program into an efficient implementation, but also might be the key to more
architecture-dependent optimization and adaptation. Categorical datatypes are
a generalization of this work [Ski94]. They capture a style of second-order
functional programming with strong mathematical (categorical) properties that
support transformation and reasoning about parallel programs. They subsume
languages like Gamma [BM91], Parallel SETL [HK93], and NESL [Ble92]. This
work differs from ours in that it is not concerned with formal program
development from specifications. Moreover, it includes complexity considerations which
are ignored in our work. Finally, both the Bird-Meertens formalism and
categorical datatypes use implicit parallelism, that is, the compiler introduces the
parallelism without user interaction. Explicit, user-level parallelism as used in
our work is not considered.
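
Two well-known list identities of the kind used in this style of
transformational programming are map fusion and the promotion of a reduction
over a partitioned list. The Python sketch below merely checks both identities
on sample data; it is not taken from the cited work, and the helper names
compose and chunks, the sample functions f and g, and the block size are
assumptions made for illustration.

```python
from functools import reduce
import operator

def compose(f, g):
    """Function composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: f(g(x))

def chunks(seq, n):
    """Split seq into consecutive blocks of at most n elements."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]

xs = list(range(1, 11))
f = lambda x: x * x
g = lambda x: x + 1

# Map fusion: two traversals of the list can be replaced by one.
two_passes = list(map(f, map(g, xs)))
one_pass = list(map(compose(f, g), xs))

# Reduce promotion: for an associative operator with a unit, reducing the whole
# list equals reducing each block and then reducing the partial results. The
# blocks are independent, so they could be handled by different processors.
whole = reduce(operator.add, xs, 0)
partials = [reduce(operator.add, block, 0) for block in chunks(xs, 3)]
by_parts = reduce(operator.add, partials, 0)

print(two_passes == one_pass, whole, by_parts)  # prints: True 55 55
```

Read from left to right, the second identity turns a sequential reduction into
one with independent sub-reductions. The operator must be associative and 0
must be its unit for the identity to hold.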

Skeletons. The next approach also has its roots in functional programming.
Higher-order functions with a lot of implicit parallelism, so-called skeletons,
are used as the basic building blocks for parallel implementations.
Portability is achieved through program transformations that convert between
skeletons [DFH+93, Bra94]. A skeleton exposes only its declarative, functional
meaning to the programmer, while a particular implementation of that skeleton for
a particular architecture is responsible for the generation of efficient, highly
parallel code. Most skeletons are defined in terms of the higher-order
functions map, filter and reduce. Like categorical datatypes, skeletons are based
on implicit parallelism. It would be interesting to see if transformational
approaches to the efficient compilation of programs with explicit parallelism exist.
A close look at the work of the SUIF compiler project would be interesting in this
respect [HAA+96].
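
The separation between the declarative meaning of a skeleton and its
implementation can be sketched as follows. Both functions below compute the
same map-then-reduce result; one is a plain sequential definition, the other
splits the input into blocks that are processed by a thread pool. This is a
minimal illustration and not one of the skeleton systems cited above: the names
map_reduce_seq and map_reduce_par and the blocking strategy are assumptions,
the operator must be associative with the given unit, and Python threads only
illustrate the structure rather than deliver real speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator

def map_reduce_seq(f, op, unit, xs):
    """Reference meaning of the skeleton: apply f everywhere, then fold with op."""
    return reduce(op, map(f, xs), unit)

def map_reduce_par(f, op, unit, xs, workers=4):
    """Same meaning, but each block of the input is handled by a pool worker."""
    xs = list(xs)
    size = max(1, -(-len(xs) // workers))  # ceiling division, at least 1
    blocks = [xs[i:i + size] for i in range(0, len(xs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda b: map_reduce_seq(f, op, unit, b), blocks))
    return reduce(op, partials, unit)

square = lambda x: x * x
seq_result = map_reduce_seq(square, operator.add, 0, range(100))
par_result = map_reduce_par(square, operator.add, 0, range(100))
print(seq_result, par_result)  # prints: 328350 328350
```

A caller written against map_reduce_seq can be switched to map_reduce_par
without changing its meaning, which is the sense in which only the functional
behaviour of a skeleton is exposed to the programmer.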

Other.  A  number  of  other  people  have  used  program  transformation  for  the 
construction  of  concurrent  programs  from  sequential  ones,  e.g.,  [AM71,  Lip75, 
FS81,  Len82,  Apt86].  We  will  not  review  these  ideas  here,  because  either  they 
have been used in work that we have already discussed, e.g., [AO91], or they
are  not  directly  related  to  our  work. 

9.1.4  Refinement  calculi  for  parallel  programs 

The complexity inherent in concurrent programming makes formal approaches
to the development of concurrent programs particularly appealing. Consequently,
there has been quite a lot of work in this area. We will only review the approaches
that seem the most relevant, but we will also choose a more detailed presentation
style, since our work also fits into this section.

Back’s Action Systems. Back’s Action Systems provide an alternative and
considerably more abstract way to model parallel computation [Bac89]. The
behaviour of a sequential, concurrent or distributed system is described in terms
of the actions that the processes in the system carry out in cooperation with
each other. Formally, an Action System is a set of actions operating on local
and global variables and has the form

A = [ var  x1 = v1, ..., xn = vn
      proc p1 = P1, ..., pn = Pn
      do A1 [] ... [] Ak od
    ]

where each action Ai is a guarded command g → S with guard g and sequence
of assignments S. Actions are atomic and may be executed in parallel, as long
as they do not have any variables in common. Atomicity of actions guarantees
that a parallel execution of an Action System yields the same results as the
sequential execution. Atomicity simplifies the programming task and the proof
theory. Parallel Action Systems can be described in terms of the sequential guarded
command language. Back presents a refinement calculus for Action Systems
where refinement expresses the preservation of total correctness. For reactive
systems, the stronger, trace-based notion of strong simulation refinement is
suggested. The refinement rules are not syntax-directed. Consequently, refinement
typically cannot be derived truly compositionally. To refine a reactive system,
the environment is partitioned into actions that can potentially influence the
behaviour of the program and those that cannot. The first group constitutes
the interface between the program and the environment. The program and its
interface are refined simultaneously. Consequently, this decomposition is only
of value when the interface is small compared to the entire environment.
Moreover, Action Systems do not have any kind of fairness conditions built into
them. Fairness constraints have to be encoded by means of an explicit
scheduler as suggested by Apt and Olderog [AO91]. To summarize, the refinement
calculus for Action Systems differs from ours as follows.

•  Action  Systems  form  a  different,  more  abstract  computational  model.

•  The  rules  of  the  calculus  are  not  syntax-directed.

•  Specifications  and  programs  are  syntactically  distinguished. 

•  Fairness  assumptions  are  not  directly  modeled  in  the  semantics,  but  must 
be  encoded  into  a  scheduler. 
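
The execution model of an Action System, a loop that repeatedly picks some
enabled guarded action and stops when no guard holds, can be sketched in
Python as follows. The sketch is an illustration under assumptions: the names
run_action_system and swap_action, the representation of the state as a
dictionary, and the sorting example are not taken from [Bac89], and the random
choice among enabled actions deliberately models no fairness constraint.

```python
import random

def run_action_system(state, actions, rng, max_steps=10_000):
    """Execute guarded actions atomically, one at a time, until none is enabled.

    actions -- list of (guard, body); body returns a dict of variable updates
    rng     -- source of nondeterminism used to pick among enabled actions
    """
    state = dict(state)
    for _ in range(max_steps):
        enabled = [body for guard, body in actions if guard(state)]
        if not enabled:
            return state  # the do-od loop terminates: no guard holds
        state.update(rng.choice(enabled)(state))
    raise RuntimeError("step bound exceeded")

def swap_action(i):
    """Action: if a[i] > a[i+1] then exchange the two elements."""
    guard = lambda s: s["a"][i] > s["a"][i + 1]
    def body(s):
        a = list(s["a"])
        a[i], a[i + 1] = a[i + 1], a[i]
        return {"a": tuple(a)}
    return guard, body

data = (5, 2, 4, 1, 3)
actions = [swap_action(i) for i in range(len(data) - 1)]

# Whatever choices are made, the system terminates in the same final state.
results = {run_action_system({"a": data}, actions, random.Random(seed))["a"]
           for seed in range(20)}
print(results)  # prints: {(1, 2, 3, 4, 5)}
```

Actions for non-overlapping index pairs touch disjoint parts of the array, which
corresponds to the condition under which actions may be executed in parallel.
Termination here follows from the fact that every swap removes one inversion.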

Qiwen Xu and He Jifeng’s work. The work of Xu and Jifeng further
extended rely-guarantee reasoning [XJ91]. Like in ProCoS and in our work, traces
are used as a unifying semantic model for both programs and specifications to
obtain a unified framework. A refinement relation between specifications is
presented. Finally, an implementation relation bridges the gap between programs
and specifications and can be derived with a compositional proof system. To
summarize, Xu and Jifeng’s work differs from ours as follows:

•  In [XJ91], both programs and specifications are mapped to the same
mathematical structure (traces), and are thus treated the same semantically.
Syntactically, however, they are still distinct. In our work, there is no
semantic or syntactic difference between the two. Programs are executable
specifications.

•  The framework is geared towards total correctness and terminating
programs. Fairness thus is not handled.

•  Apart from termination, no other liveness properties can be handled.

•  Message-passing concurrency is not addressed.

Previous work by the author. The definition of refinement employed in
this thesis rests on a context-sensitive notion of approximation which in turn
rests on labeled transition traces. Both notions were introduced in [Din96].
In [Din97], context-sensitive approximation is employed directly as refinement
relation. Rules are given that allow the replacement of a component by some
other component under certain minimal context assumptions, that is, in a
context that has a certain maximal discriminating power. A preorder E1 ⊑ E2 on
contexts similar to the one presented in Section 4.3 captures the capabilities of
a context for interference, that is, their discriminating power. The refinement
process is similar to the one used in this thesis. Refinement of a component in
some environment E1 involves finding the appropriate rule, and showing that
E1 respects the assumptions E2 expressed in the rule. A major difference is,
however, that the refinement relation itself does not contain guarantees. The
context preorder is used to show that E1 respects the assumptions expressed
in E2. In other words, the interplay between context-sensitive approximation
between programs and approximation between contexts allows compositional
proofs.

The simple syntactic structure of UNITY also allowed the formulation of
context-sensitive approximation as a game-playing activity. Stirling
demonstrates how the verification of labeled transition systems with respect to
mu-calculus formulas can be recast using game theory [Sti96]. In general, proving
that a program C meets its specification φ can be given the following natural
game-theoretic interpretation. An adversary plays legal environment moves
to cause C to deviate from its specification. If she succeeds, then C violates
its specification. If she never succeeds, then C satisfies its specification. In
sequential programming, for example, φ could be a Hoare-triple {P} C {Q}.
The moves by the adversary would then be confined to the very beginning of
the play, where the adversary would try to find an initial state that satisfies P
but still causes C to terminate in a state that does not satisfy Q. In the
concurrent world, φ could be some kind of assumption-commitment specification
that places assumptions on the behaviour of the environment of C and in turn
makes certain guarantees about the behaviour of C whenever C is executing
in an environment that meets these assumptions. The adversary now has
substantially more means at her disposal to show that C violates the specification.
She can interfere not only before but also during the execution of C and change
the state arbitrarily as long as she observes the assumptions. In [Din97], we
show how games can also provide an appealing metaphor for context-sensitive
approximation C1 >_E C2 in UNITY and thus for the compositional refinement
or verification of concurrent systems. Intuitively, the game-theoretic
interpretation of C1 >_E C2 is as follows. Suppose that the adversary makes moves in both
the environment E and program C2 while the player controls C1. In [Din97],
we prove that C1 >_E C2 iff there is no sequence of moves, alternating
between player and adversary, which ends in a state in which the adversary can
find a transition of C2 for which the player cannot find a matching transition
of C1. In the light of this game-theoretic characterization, the context preorder
E1 ⊑ E2 can be interpreted as comparing the “repertoire” of moves that E1
and  E2  offer.  For  example,  a  game  involving  E2  =  □  ||  Var\\tiyUY  is  easier  for 
the  adversary  to  win  than  a  game  involving  Ei~  []  |{  inv*x^  because  E2  offers 
a  larger  repertoire  of  moves  for  the  adversary. 

Note that the game-theoretic interpretation of C1 >_E C2 is reminiscent of
the notion of simulation on transition systems [Mil89]. More precisely, given the
labeled transition systems for E[C1] and E[C2], we could define a simulation
relation that keeps program and environment transitions distinct through the
labeling. Besides the fact that context-sensitive approximation is a linear-time
notion, whereas simulation is a branching-time notion, context-sensitive
approximation differs from simulation in two additional ways. First, the matching
between two states does not have to be exact but only relative to the
non-local variables. Second, just like the trace sets in our semantics, the transition
systems would have to be closed under stuttering and mumbling.

While  the  work  presented  in  this  document  grew  out  of  this  early  work,  there 
are  a  number  of  substantial  differences. 

•  As sketched above, the notion of assumption-commitment reasoning is
quite different. In particular, we do not use a context-preorder to express
that a context satisfies certain assumptions.

•  The refinement rules in [Din97] are not syntax-directed and rather ad hoc.

•  The work concentrates on UNITY-style shared-variable concurrency;
message-passing concurrency is not addressed.

The work reported in [Din99b] is a direct extension of the ideas described
above and a direct precursor of the work presented here. Rather than UNITY, a
simple shared-variable while language with synchronization is targeted.
Message-passing is missing. Stirling’s assumption-commitment formulas and
context-sensitive approximation are combined to form the refinement relation. Rules
similar to the ones in this document are given and the bank account problem of
Section 6.1 is discussed.

A refinement calculus for BSP. Bulk synchronous parallelism (BSP) [SHM96,
Val90] is a parallel programming model that abstracts from low-level program
structures in favour of supersteps. A superstep consists of a set of independent
local computations, followed by a global communication phase and a barrier
synchronization. An advantage of BSP programs is that their cost can be accurately
determined from a few simple architectural parameters, namely the permeability
of the communication network and the duration of a synchronization step.
Moreover, barrier synchronizations in general turned out to be not as expensive
as expected. As a result, the structure inherent to BSP brings considerable
benefits from an application-building perspective without major performance
penalties. Indeed, the advocates of BSP regard it as a promising candidate for
a viable, architecture-independent model for parallel programming.
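
How the cost of a BSP program follows from these few parameters can be sketched
with the cost expression commonly used in the BSP literature: the cost of a
superstep is the maximal local work, plus the maximal number of words sent or
received by any processor multiplied by a network parameter, plus the cost of
the barrier. The Python sketch below is illustrative only; the text above does
not state the formula, and the names Superstep, superstep_cost, program_cost,
g and l, as well as the sample numbers, are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Superstep:
    work: list      # local computation steps per processor
    sent: list      # words sent per processor
    received: list  # words received per processor

def superstep_cost(step, g, l):
    """Cost = max local work + h * g + l, where h is the largest number of
    words any single processor sends or receives in this superstep."""
    h = max(max(s, r) for s, r in zip(step.sent, step.received))
    return max(step.work) + h * g + l

def program_cost(steps, g, l):
    """Supersteps are separated by barriers, so their costs simply add up."""
    return sum(superstep_cost(step, g, l) for step in steps)

# Hypothetical machine parameters: g = cost per word communicated,
# l = cost of one barrier synchronization (both in computation steps).
g, l = 4, 20
steps = [
    Superstep(work=[10, 12, 8], sent=[2, 0, 1], received=[0, 3, 0]),
    Superstep(work=[5, 5, 5], sent=[1, 1, 1], received=[1, 1, 1]),
]
costs = [superstep_cost(step, g, l) for step in steps]
total = program_cost(steps, g, l)
print(costs, total)  # prints: [44, 29] 73
```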

Skillicorn addresses the issue of parallel software construction and extends
Morgan’s refinement calculus to allow for the formal design of programs in the
BSP style [Ski98]. He models a distributed-memory architecture by splitting
the frame V in Morgan’s specification statement into two parts (rf, wf):[P, Q],
where rf is the read frame and wf is the write frame. To express information
about which processor holds which value, location predicates Pi(x) are
introduced. Pi(x) holds if the value of variable x resides in processor i. A final step
is the addition of three constructs distribute, collect and redistribute for the
movement of values using the distribution implied by the location predicates.
The resulting refinement calculus is shown to be a conservative extension of
Morgan’s calculus for sequential programs. A major difference from our approach
is that variables cannot be shared across processors. In the BSP model, the
computation on local data is followed by a barrier synchronization step which
realizes the data exchange needed for the next local computation. It would be
interesting to see how much our approach could support the BSP model.

UNITY. Just like Action Systems, the primary concern in UNITY is the logic
design of concurrent programs [CM88]. Compared to Action Systems,
however, UNITY takes an even more abstract view on concurrent programming. It
separates what problem is to be solved from when, where, and how this can be
achieved. The what is specified in a program, whereas the when, where, and how
are specified in a mapping which describes how the constructs and variables are
to be mapped to a particular architecture. This separation, which is much more
rigorous than for Action Systems for example, allows for a simple programming
notation that is appropriate for a wide variety of architectures. Together with
a strong emphasis on non-operational reasoning, it is this simplicity that makes
the UNITY approach very appealing. On the other hand, however, due to the
high level of abstraction, the UNITY approach means a quite radical departure
from traditional methods for program development and verification. The
implementation of a UNITY program on some parallel machine, for instance, can
be a non-trivial task. Chandy and Misra describe the situation as follows:

“Of course, this simplicity is achieved at the expense of making
mappings immensely more important and more complex than they
are in traditional programs” [CM88, page 9].

Moreover,  due  to  the  absence  of  control  flow  in  UNITY,  existing  theories  for 
program  development  and  verification  are  not  applicable. 

UNITY  programs  are  based  on  a  simple,  computational  model.  A  UNITY 
program  is  of  the  form 


Program  name 
declare  declarations 
always  invariants 
initially  precondition 
assign  guarded-commands
end 

where  the  guarded-commands  are  executed  in  an  infinite  loop  from  an  initial 
state  satisfying  the  precondition.  On  any  iteration  of  the  loop,  a  statement 
whose  guard  is  true  is  executed.  If  the  guards  of  several  statements  are  true,  a 
choice  is  made  nondeterministically.  The  choice  is  subject  to  a  fairness  condition 
saying that each statement must be chosen infinitely often. All states
encountered during the execution will satisfy the invariants. An execution that has
reached  a  fixed  point,  that  is,  that  has  ceased  to  change  the  state,  is  regarded 
as  terminated.  Note  that  sequential  composition  is  not  offered  as  a  language 
construct,  but  has  to  be  encoded  by  the  programmer. 
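
The execution model just described can be illustrated with a small simulation. The following Python sketch is not part of UNITY or of [CM88]; the encoding of guarded commands as (guard, update) pairs, the names step, is_fixed_point, run and swap_if_greater, and the round-based scheduler are all assumptions made for illustration. Picking every command once per round in a random order is only one simple way to approximate the fairness condition.

```python
import random
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
# A guarded command is modeled as a (guard, update) pair; this encoding is an assumption.
Command = Tuple[Callable[[State], bool], Callable[[State], State]]


def step(state: State, command: Command) -> State:
    """Apply the update if the guard holds; otherwise leave the state unchanged."""
    guard, update = command
    return update(state) if guard(state) else dict(state)


def is_fixed_point(state: State, commands: List[Command]) -> bool:
    """A state is a fixed point if no command can change it."""
    return all(step(state, c) == state for c in commands)


def run(state: State, commands: List[Command], rng: random.Random,
        max_rounds: int = 10_000) -> State:
    """Run until a fixed point is reached; each round picks every command once, in random order."""
    state = dict(state)
    for _ in range(max_rounds):
        if is_fixed_point(state, commands):
            return state
        order = list(commands)
        rng.shuffle(order)
        for command in order:
            state = step(state, command)
    raise RuntimeError("no fixed point reached within max_rounds")


def swap_if_greater(i: str, j: str) -> Command:
    """Guarded command: if s[i] > s[j] then swap the two variables."""
    return (lambda s: s[i] > s[j],
            lambda s: {**s, i: s[j], j: s[i]})


# Hypothetical example program: sort three variables by repeated adjacent swaps.
commands = [swap_if_greater("a", "b"), swap_if_greater("b", "c")]
final = run({"a": 3, "b": 1, "c": 2}, commands, random.Random(0))
print(final)
```

Note how the example has no control flow of its own: the order in which swaps happen is left to the scheduler, and termination is recognized only by the absence of further state changes.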

Development and verification of UNITY programs is supported by a specific
UNITY logic that contains primitives for the expression of safety and liveness
properties.  Specifications  are  collections  of  properties.  Stepwise  refinement  is 
achieved  by  strengthening  specifications.  Methods  to  compose  larger  programs 
from  smaller  ones  are  suggested.  The  formal  design  of  UNITY  programs  using 
stepwise  refinement  thus  is  reminiscent  of  the  approaches  based  on  algebraic 
specifications like Kestrel’s KIDS. Most nonalgebraic development techniques
propose a program skeleton and then flesh out an underspecified part in each
refinement step. This has the advantage that the overall structure of the program
is apparent at the early stages of the design. In UNITY, however, the
specification  itself,  i.e.,  the  logical  description  of  the  desired  properties  of  the 
program,  is  refined.  At  each  refinement  step,  some  program  properties  proposed 
in  previous  steps  are  replaced  by  other,  more  detailed  properties.  This  means 
that the program structure may not be visible until the later stages of the design.
Moreover, as opposed to most other approaches, the refinement relation is
based  on  properties.  The  superposition  theorem  (in  UNITY  refinement  is  called 
superposition)  makes  this  very  obvious. 

“Every property of the underlying program is a property of the transformed
program” [CM88, page 165].

Although  trace  semantics  have  been  given  for  UNITY  in  a  number  of  places, 
e.g. [CM88, Liu89, dBKPR91, UK93b, UK93a, Din97], only a few approaches
use them to define a trace-based notion of refinement [UK93b, UK93a, Din97].
Udink and Kok show that trace-based notions are strictly finer grained, that
is, less abstract, than property-based notions [UK93b]. One significant drawback
of UNITY, as presented in [CM88], is its non-compositional computational
model. The behaviour of a composite program is not described solely in terms
of its components, making it hard to reason compositionally. This deficiency
has been addressed in work [Col94, CK95] which views a UNITY program as an
open system which is subject to interruptions and intermediate state changes by
the environment. In his thesis, Collette [Col94] defines a compositional trace
semantics for UNITY based on potential computations. He augments the UNITY
logic with assumption-commitment specifications and uses a composition principle
similar to the one proposed by Abadi and Lamport [AL93] to equip UNITY
with a compositional parallel proof rule for assumption-commitment specifications.
The result is a syntax-directed, compositional, complete proof system.
However, this extension also treats specifications of components as conjunctions
of properties. This casts some doubt on the scalability of this approach
in certain cases, because more complex components increase the danger that
properties affect each other in intricate and unexpected ways. Moreover, the
semantics of local variables is, in our opinion, not sufficiently abstract. While the
environment cannot change the values of local variables, it can observe changes
to them. Consequently, the programs

new x = 1 in x := x + 1 end    and    new x = 1 in x := x + 2 end

are  not  considered  equivalent.  To  summarize,  the  main  differences  between 
UNITY  and  our  approach  are: 

•  UNITY  is  built  on  a  different,  very  abstract  computational  model. 

•  UNITY  programs  are  specified  using  a  specific  temporal  logic. 

•  Refinement  is  property-based  and  not  trace-based. 

•  Scope  and  locality  are  handled  differently. 


Abadi  and  Lamport’s  work.  Abadi  and  Lamport  free  themselves  of  any 
particular  program  syntax  or  paradigm  and  study  parallel  programming  from  a 
very abstract point of view that strives to replace operational by logical reasoning.
In [AL93], they isolate in very general terms the conditions under which
assumption-commitment specifications for the components of a system imply
an assumption-commitment specification of the overall system. These conditions
are captured in a proof rule called the composition principle. In [AL91],
they use a very general, abstract, semantic setting to study refinement mappings.
Behaviours are sequences of states closed under stuttering. A refinement
mapping is used to verify that a lower-level specification correctly implements
a higher-level one. In their setting, specifications are given as state machines
and refinement mappings map the state space of the lower-level machine S1
to the state space of the higher-level machine S2. The main contribution is a
completeness result: If S1 implements S2 and certain reasonable assumptions
are satisfied, then by adding auxiliary variables the existence of a refinement
mapping between S1 and S2 can be guaranteed. Refinement through refinement
mappings is stronger than trace inclusion. Due to the generality of their
setting, their results are applicable to a variety of approaches to modeling concurrent
computation. However, since they do not consider a particular syntax,
no  syntax-directed  proof  system  is  given. 

ProCoS.  The  ProCoS  project  (“Provably  Correct  Systems”)  resembles  the  CIP 
project  in  its  goals.  It  is  a  comprehensive  wide-spectrum  verification  project  that 
studies  embedded,  concurrent  and  communicating  systems  at  various  levels  of 
abstraction [B+89]. The levels encompass requirements capture, specification
language,  programming  language  and  machine  language.  The  principal  goal 
of  the  project  is  to  formally  connect  all  these  different  levels  of  abstraction 
through  stepwise  transformation  and  thus  allow  the  development  of  concurrent 
systems  that  are  correct  by  construction.  A  specification  language  is  used  that 
combines  trace-based  with  state-based  assertional  reasoning.  The  trace  part 
specifies safety and liveness properties of the communication behaviour of processes.
The state part consists of state variables and communication assertions
describing when a channel is enabled for communication and what the effect
of a communication is. Using a set of transformation rules, a specification is
first successively transformed into a distributed, concurrent OCCAM-like program
[Ltd84, ORSS92, OR93]. The resulting program is then mapped to a
machine language. The theoretical foundation of ProCoS is strongly influenced
by [Old91]. This approach shares with our work the emphasis on program development
through stepwise transformation and refinement and the use of traces as
a specification tool. However, since ProCoS addresses message-passing concurrency
and not shared-memory concurrency, the notion of trace employed there is
based  on  actions  rather  than  states.  Moreover,  neither  a  syntax-directed  proof 
system  nor  a  refinement  calculus  is  given. 

FOCUS.  Like  CIP  and  ProCoS,  the  FOCUS  project  also  aims  at  supporting 
the  systematic  formal  specification  and  development  of  distributed  interactive 
systems [BDD+92]. The notion of a trace (here the term stream is used) also
forms the foundation of the framework [Bro86]. Just like in ProCoS, message-passing
concurrency is emphasized: Streams either range over actions or messages.
A logic allows the specification of sets of traces. The behaviour of system
components is described in FOCUS by means of stream processing functions
which specify how tuples of input traces are mapped to tuples of output traces.
Methods for the compositional development and verification of concurrent systems
are given [BDD+92, Bro92]. Stølen has augmented the stream function
model with assumption-commitment specifications and presented a refinement
calculus [SDW95].

9.2 Semantic models for concurrent computation

Traces.  Traces  have  long  been  known  as  an  adequate  model  for  concurrent 
computation.  In  its  most  basic  form,  a  trace  simply  records  all  intermediate 
states  that  a  program  runs  through  during  its  execution.  A  trace  is  just  a 
possibly  infinite  sequence  of  states 

$s_0 \to s_1 \to s_2 \to \cdots$


where  every  transition  is  caused  by  the  program.  We  have  called  these  traces 
executions.  Executions  are  useful  when  a  program  is  to  be  analyzed  in  isolation, 
as  a  closed  system.  Moreover,  they  form  a  natural  generalization  of  the  partial 
and  total  correctness  behaviours  known  from  sequential  programming.  The 
problem  is  that  they  do  not  adequately  model  reactive  systems  that  are  in 
continuous  interaction  with  their  environment.  Moreover,  they  do  not  allow  the 
definition  of  a  compositional  computational  model:  the  executions  of  a  parallel 
program  Ci  ||  C2  cannot  be  obtained  from  the  executions  of  Ci  and  C2  [Mil73]. 
To  achieve  an  adequate,  compositional  model,  environment  interference  has  to 
be  taken  into  account.  Francez  and  Pnueli  were  among  the  first  people  to  realize 
this  [FP78].  As  outlined  in  Section  9.1.2,  the  behaviour  of  a  program  is  modeled 
by  sequences  of  states 

$s_0 \to s_1 \to s_2 \to \cdots$

where  some  of  the  steps  may  have  been  performed  by  the  environment.  Program 
and  environment  steps  are  distinguished  based  on  the  variables  that  they  change. 
Each  variable  is  assigned  to  a  parallel  process  that  owns  it  and  is  allowed  to 
change  it. 


Potential  computations.  A  compositional  treatment  can  thus  be  obtained 
without  explicit  labeling.  The  somewhat  unnatural  grouping  of  variables  in  a 
shared-variable  setting  can  be  avoided  by  using  explicit  labels  to  differentiate 
between program and environment steps. A potential computation (sometimes
also called extended sequence) is a sequence of program and environment transitions
each marked with the appropriate label

$s_0 \xrightarrow{l_0} s_1 \xrightarrow{l_1} s_2 \xrightarrow{l_2} \cdots$

where each $l_i$ is a label ranging over {p, e}. Potential computations have been
used by, for instance, Jones, Stirling, Stølen and Collette [Jon81, Sti88, Stø91,
Col94] for the definition of complete, compositional proof systems.

Transition  traces.  Transition  traces  are  isomorphic  to  potential  computations. 
For  instance,  the  potential  computation, 

$s_0 \xrightarrow{p} s_1 \xrightarrow{e} s_2 \xrightarrow{p} s_3 \cdots s_i \xrightarrow{p} s_{i+1} \xrightarrow{e} s_{i+2} \xrightarrow{p} s_{i+3} \cdots$

corresponds to the transition trace

$(s_0, s_1)(s_2, s_3) \cdots (s_i, s_{i+1})(s_{i+2}, s_{i+3}) \cdots$

that  is,  state  changes  within  parentheses  are  caused  by  the  program  and  state 
changes  across  parentheses  are  caused  by  the  environment.  This  notion  of 
trace  has  been  used  in  several  places,  e.g.,  [Abr79,  Par79,  dBKPR91,  UK93b]. 
Brookes’  contribution  was  the  addition  of  the  stuttering  and  mumbling  closure 
conditions.  In  his  semantics  programs  are  modeled  as  closed  sets  of  transition 
traces [Bro96b]. These closure conditions stand for reflexivity and transitivity
of the standard operational semantics, that is, the $\to^*$ transition relation.
They allow the definition of a semantics that is fully abstract with respect
to  the  standard  notion  of  observational  behaviour  and  satisfies  many  natural 
laws  of  concurrent  programming.  Before  Brookes’  work,  the  only  fully  abstract 
semantics for a shared-variable concurrent language was Hennessy and Plotkin’s
resumption semantics [HP79]. However, they obtain this result at a rather heavy
price. Since their semantics distinguishes programs of different length (skip
and skip ; skip, for instance, are distinguished), the capabilities of program
contexts  had  to  be  strengthened  such  that  skip  and  skip  ;  skip  could  also  be 
distinguished  observationally.  To  this  end,  Hennessy  and  Plotkin  add  a  rather 
unnatural  coroutining  construct  to  the  language. 
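
The correspondence between potential computations and transition traces, and the effect of the two closure conditions, can be made concrete with a few lines of Python. This is only a sketch under simplifying assumptions: traces are finite, states are plain integers, and the function names to_transition_trace, stutter and mumble are ours rather than Brookes’. Each function performs a single closure step and does not compute the full closure.

```python
from typing import List, Tuple

State = int
Step = Tuple[State, str, State]   # (s, label, s') with label "p" or "e"
Pair = Tuple[State, State]        # one program transition (s, s')


def to_transition_trace(computation: List[Step]) -> List[Pair]:
    """Keep program steps as pairs; environment steps become the gaps between pairs."""
    return [(s, t) for (s, label, t) in computation if label == "p"]


def stutter(trace: List[Pair], i: int, s: State) -> List[Pair]:
    """One stuttering step: insert an idle program transition (s, s) at position i."""
    return trace[:i] + [(s, s)] + trace[i:]


def mumble(trace: List[Pair], i: int) -> List[Pair]:
    """One mumbling step: merge pairs i and i+1 if no environment change lies between them."""
    (s, t), (t2, u) = trace[i], trace[i + 1]
    if t != t2:
        raise ValueError("the environment changed the state between the two steps")
    return trace[:i] + [(s, u)] + trace[i + 2:]


computation = [(0, "p", 1), (1, "e", 5), (5, "p", 6), (6, "p", 7)]
trace = to_transition_trace(computation)
print(trace)              # program steps only
print(mumble(trace, 1))   # the two adjacent program steps merged into one
```

In this encoding a trace of skip ; skip in which the environment does not interfere, such as [(3, 3), (3, 3)], mumbles to the skip trace [(3, 3)]. This illustrates one direction of why the closed trace sets do not distinguish programs by their length.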

Abadi  and  Lamport’s  and  Collette’s  treatment  differs  in  that  they  do  include 
a  stuttering  but  no  mumbling  closure  condition.  Full  abstraction  is  achieved  by 
none of the related work mentioned. Moreover, Jones’, Stirling’s and Stølen’s
treatment  emphasizes  total  correctness  and  terminating  programs,  and  can  thus 
avoid  the  modeling  of  fairness.  Finally,  Stirling  does  not  address  local  variables 
at  all  while  Collette’s  treatment  is  not  as  abstract. 

While  transition  traces  lead  to  a  semantics  with  very  pleasant  properties, 
they  do  not  support  the  kind  of  context-sensitive  replacement  that  program 
development  through  stepwise  refinement  calls  for.  The  problem  is  that  in  a 
parallel composition C1 || C2 the transitions contributed by each of the components
cannot be distinguished. It is thus impossible to express the condition
that a refining program C1' does not exhibit more transitions than C1 did in
the given context C1 || []. To remedy this situation, we added labels to the
language and the semantics. In the labeled program C1 || ⟨C2⟩, the transitions
of C2 are singled out and can be distinguished from those of C1. Formally, a
label is added to each transition indicating whether or not the transition is due
to the labeled subprogram. A labeled program is modeled by labeled
transition traces


$(s_0, l_0, s_0')(s_1, l_1, s_1') \cdots (s_i, l_i, s_i')(s_{i+1}, l_{i+1}, s_{i+1}') \cdots$

where $l_i = p$ indicates a program step and $l_i = e$ indicates an environment
step  [Din96] .  Labels  are  crucial  in  our  definition  of  context-sensitive  approxima¬ 
tion  Cl  >E  Cl,  because  they  allow  the  user  to  single  out  a  specific  subprogram 
syntactically  and  semantically.  The  ability  to  label  arbitrary  subprograms  then 
allows us to define a refinement calculus and to characterize complex properties
like eventual entry and to formulate sufficient conditions for it. This idea is by
no means specific to transition traces and would also work for potential computations.
However, labels also mesh well with the transition trace semantics.
In other words, our semantics based on labeled transition traces readily inherits
from Brookes’ semantics the elegant, compositional treatment of fair, shared-variable
concurrent programs. Message-passing can easily be accommodated in
terms of queue-valued variables [Bro97].

While  the  underlying  compositional  model  is  quite  different,  Larsen’s  work 
on a context-sensitive notion of simulation for CCS is clearly related and did
provide  some  intuition  [Lar87]. 


Chapter  10 

Future  work 


We  distinguish  between  improvements,  extensions  and  applications. 


10.1  Improvements 

We  have  demonstrated  that  the  calculus  presented  in  this  thesis  works  well 
on  relatively  short,  yet  intricate  examples.  However,  it  is  not  yet  suited  as  a 
complete  and  universal  design  methodology  for  larger  programs  and  systems. 
Necessary  improvements  include  the  following: 

10.1.1  Incorporating  data  reification 

Data reification (data refinement) is an important part of formal program design
methodologies [DH72, Hoa72, Rey81, Jon90, dRE99]. In our setting, an
abstraction mapping would map concrete traces to abstract traces. Without
having checked the details, we suspect the calculus to mesh well with data reification.
A good starting point might be Brookes’ work on Parallel Algol in which
he  has  used  a  parametric  trace  model  to  formalize  representation  independence 
for  parallel  programs  [Bro96a]. 

10.1.2  Adding  a  procedure  mechanism 

A procedure mechanism would undoubtedly be helpful and facilitate the specification
and derivation of certain programs. Brookes has demonstrated how to
extend the language with call-by-name procedures [Bro98]. The handling of local
variables follows the “possible worlds” or “store shapes” approach first used
by Reynolds [Rey81] and Oles [Ole82] leading to a more general definition of
local variable and local channel declarations. The resulting language supports
fair shared-variable concurrency, asynchronous message-passing, and recursive,
call-by-name procedures leading to a generalization of the Kahn principle for
deterministic networks [Kah77]. We conjecture that this model is also robust
under the addition of labels as described in this thesis.

10.1.3 Increasing generality

The  development  of  certain  kinds  of  programs  is  currently  not  well  supported. 
Consider,  for  instance,  the  programs 


C1 = while tt do x := x + 1

and

C2 = new c = d = ff in
     while tt do
       x := x + 1;
       await c;
       x := x + 1;
       await d

and

C2' = new c = d = ⟨⟩ in
      while tt do
        x := x + 1;
        c?_;
        c!*;
        x := x + 1;
        d?_;
        d!*

C2' can be viewed as a “miniature token ring” with * as the token. Unfortunately,
the AWAIT-INTRO rule is not applicable, because the environment of
each await statement cannot be shown to reduce the measure as required by the
rule. Consequently, it would not be straightforward, and maybe impossible, to
refine C1 into C2 or C2' with our calculus despite the equivalence


C1 =rt C2 =rt C2'.


The problem. The reason is that our (and Stirling’s) notion of assumption-commitment
reasoning does not allow liveness properties in the assumptions
or guarantees. As described in Section 9.1.2, this restriction prevents circular
reasoning and thus ensures soundness. While some other approaches,
e.g., [FP78, AL93], manage to break circularity without imposing this restriction,
it seems very hard, if not impossible, to incorporate the underlying ideas
into our setting.

A  possible  solution.  Rather  than  allowing  liveness  properties  in  assumptions 
and  commitments,  we  propose  a  different  approach.  Brookes  has  developed  a 
number  of  extremely  powerful  equivalences  involving  parallel  compositions  in 
the  scope  of  local  variable  declarations  [Bro98].  Given  the  program 


new c = l in [c!v ; C1 || C2]


for  instance,  the  local  output  on  c  can  be  promoted  before  all  transitions  that 
C2 might be able to do, if C2 never outputs on c. Formally, let fc(C) denote the
set of directions free in C where a direction is of the form c! or c? for channels
c. If C2 never outputs to channel c, that is, c! ∉ fc(C2), then

new c = l in [c!v ; C1 || C2]
≡ new c = lv in [C1 || C2].

A similar rule exists for input. If C2 never inputs from channel c, that is,
c? ∉ fc(C2), then


new c = vl in [c?x ; C1 || C2]
≡ new c = l in [x:=v ; C1 || C2].

These  and  other  laws  allow  the  successive  unrolling  of  a  program.  Equivalence 
between  two  programs  can  then  be  shown  by  demonstrating  that  they  satisfy  the 
same recurrence equations. The equivalence between the programs C1, C2 and
C2' above, for instance, can be shown using this technique. Brookes has used this
technique  to  verify  the  Alternating  Bit  Protocol  by  showing  its  equivalence  to  a 
one-place  buffer.  While  these  laws  would  make  our  framework  more  applicable, 
they  also  require  a  certain  amount  of  global  reasoning  and  thus  compromise 
compositionality.  More  research  is  needed  to  determine  to  what  degree  these 
laws  would  also  obviate  the  need  for  liveness  properties  in  assumptions  and 
guarantees. 
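
The queue discipline behind the two channel laws can be sketched in Python. We assume, following the treatment of message-passing via queue-valued variables, that a local channel holds a FIFO queue, that an output appends at the tail and that an input removes from the head. The functions send and receive are hypothetical names introduced for this sketch; they model only the effect on the channel contents, not the side conditions on C2 or the interleaving with other processes.

```python
from typing import Tuple

Queue = Tuple[int, ...]   # contents of a local channel, head first


def send(queue: Queue, v: int) -> Queue:
    """Effect of c!v on the channel contents: v is appended at the tail."""
    return queue + (v,)


def receive(queue: Queue) -> Tuple[int, Queue]:
    """Effect of c?x on the channel contents: the head is removed and returned."""
    if not queue:
        raise ValueError("input on an empty channel would block")
    return queue[0], queue[1:]


# Promoting a local output: starting with contents (1, 2) and doing c!3 first
# leaves the same contents as starting directly with (1, 2, 3).
print(send((1, 2), 3))

# Promoting a local input: with contents (3, 1, 2), doing c?x first behaves
# like the assignment x := 3 with remaining contents (1, 2).
print(receive((3, 1, 2)))
```

Repeated application of such steps is what allows a program to be unrolled one communication at a time.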


10.2  Extensions 

There  are  a  number  of  aspects  that  the  framework  might  usefully  be  extended 
with.  We  list  a  few  starting  with  lightweight,  short-term  extensions  and  ending 
with  more  long-term  investigations. 

10.2.1 Identifying development tactics

A  close  look  at  the  examples  shows  that  certain  sequences  of  transformation 
steps  occur  repeatedly.  A  natural  idea  is  to  group  these  steps  together  into 
“transformation macros” or high-level refinement steps à la the tactics in Planware
[Smi99]. For an example, consider the maxsearch and the arraysearch
algorithms as presented in Chapters 6.2, 6.3, and 6.4 respectively. These algorithms
have in common that a certain value is to be computed and then stored
in  a  specific  variable.  For  the  purposes  of  this  discussion,  let  that  variable  be  x 
and  let  Q{x)  characterize  the  final  state,  that  is,  the  state  in  which  x  carries  the 
desired  value.  The  initial  program  Ci  in  each  of  the  examples  expresses  that  x 
may  be  changed  a  finite  number  of  times  before  the  computation  terminates  in 
a  state  with  Q{x).  To  capture  this,  each  of  the  initial  programs  is  of  the  form 

{x}:[tt, tt]* ; {Q(x)}.

The idea now is to use local variables y1 to yn to first compute an intermediary
result. The computation of the desired value for x then uses this intermediary
result. More precisely, a finite amount of computation, in which only the new
local variables y1 through yn are changed, is used to establish an intermediate
state characterized by a predicate over y1, ..., yn. The second refinements in the
above-mentioned examples thus have the shape

new y1 = v1, ..., yn = vn in
  compute intermediate values;
  compute result from intermediate values

The development effort thus has been divided into two parts: the development
of the computation of the intermediate values and the development of the computation
of the final result from these values. Depending on the situation, each
part is now refined by introducing a while or for loop, a parallel composition, or
a simple assignment. In case of the maxsearch algorithm, the first part is refined
into a for loop containing a parallel composition. The second part is replaced
by  a  simple  assignment.  The  point  is  that  this  initial  part  of  the  development 
of  both  programs  could  be  conveniently  summarized  by  the  equivalence 

C =r new y1 = v1, ..., yn = vn in C

where none of the yi are assigned to in C and C is instantiated with

{x}:[tt, tt]* ; {Q(x)}.

Another possible “development macro” involves the introduction of parallelism
using the PAR-INTRO rule. Further research is needed to determine the
most  useful  ones. 

10.2.2 Enhancing the tractability of assumption-commitment specifications

The  assumptions  necessary  for  the  soundness  of  a  refinement  step  are  collected 
during  the  formal  derivation  of  the  refinement.  To  check  that  they  are  contained 
in  the  guarantees,  however,  it  is  sometimes  necessary  to  reformulate  them  and 
state  them  more  concisely.  In  recent  work,  Collette  and  Jones  investigate  ways 
to  improve  the  tractability  of  assumption-commitment  specifications  as  they 
arise  in  compositional  proof  systems  for  concurrent  programs  [CJ].  Although 
the  precise  form  of  our  specifications  is  different,  some  of  the  ideas  put  forward 
might  still  be  applicable  to  our  setting. 

10.2.3  Extending  the  framework  to  BSP  style  programs 

Skillicorn’s  extension  of  Morgan’s  refinement  calculus  to  BSP  style  programs 
does  allow  the  introduction  of  disjoint  parallelism  and  the  movement  of  values. 
Shared-variables,  however,  are  not  allowed.  It  would  be  interesting  to  extend  our 
framework  to  incorporate  BSP  style  programs.  Currently,  our  notion  of  state 
models memory in a flat and monolithic way. To capture BSP, the memory has
to be partitioned into local memories for each processor. The data movement
primitives  have  to  be  given  a  trace  semantics.  To  incorporate  Skillicorn’s  ideas, 
the  calculus  then  has  to  be  augmented  with  read  and  write  frames,  and  location 
predicates. 

An  interesting  point  in  this  respect  is  that  some  of  the  examples  already  are 
in  a  BSP  style.  Consider,  for  instance,  the  Floyd-Warshall  algorithm.  In  each 
iteration,  parallel  processes  are  spawned  to  compute  a  new  approximation 
and  then  joined  again. 

for k = 1 to n do
  ||_{i,j=1}^{n} d[i, j] := min{d[i, j], d[i, k] + d[k, j]}
od

An equivalent yet more efficient program would avoid the costly, repeated creation
and destruction of parallel processes by replacing the loop of parallel
compositions with a parallel composition of loops. Equivalent behaviour is ensured
through a barrier synchronization involving local variables. For instance,
an  equivalent  BSP-style  program  could  have  the  form 

new done_{1,1} = ff, ..., done_{n,n} = ff in
  ||_{i,j=1}^{n} for k = 1 to n do
      d[i, j] := min{d[i, j], d[i, k] + d[k, j]};
      sync_{i,j}
  od

where sync_{i,j} abbreviates

  done_{i,j} := tt;
  await ∀ 1 ≤ i, j ≤ n. done_{i,j};
  done_{i,j} := ff.

The  above  program  still  assumes  one  monolithic  global  memory  that  spans  all 
parallel  processes.  Concepts  to  distribute  and  move  data  must  be  introduced  to 
model  BSP  more  faithfully. 
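
The transformation from a loop of parallel compositions to a parallel composition of loops can be tried out with threads. The sketch below is an illustration only and makes several assumptions: it uses Python’s threading.Barrier in place of the done flags and await statements, it keeps the single shared memory of the program above, and it assumes a distance matrix with zero diagonal and no negative cycles, so that row k and column k do not change during round k. The function names are ours.

```python
import threading
from typing import List

INF = float("inf")
Matrix = List[List[float]]


def floyd_warshall_sequential(dist: Matrix) -> Matrix:
    """Reference version: one global loop over k, updating every d[i][j] in turn."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d


def floyd_warshall_barrier(dist: Matrix) -> Matrix:
    """One long-lived thread per entry (i, j); a barrier separates the rounds."""
    n = len(dist)
    d = [row[:] for row in dist]
    barrier = threading.Barrier(n * n)   # stand-in for the done flags

    def worker(i: int, j: int) -> None:
        for k in range(n):
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])
            barrier.wait()               # nobody starts round k+1 before round k is complete

    threads = [threading.Thread(target=worker, args=(i, j))
               for i in range(n) for j in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return d


# Hypothetical four-node example graph.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
print(floyd_warshall_barrier(graph))
```

The threads are created once and joined once, whereas the original formulation would create and destroy n × n processes in every iteration of the outer loop.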

Another,  more  interesting  example  is  the  prefixsum  algorithm  of  Section  7.1. 
Each  iteration  consists  of  two  parallel  compositions  each  involving  n  processes. 
The  first  composition  performs  the  pointer  jumping.  The  second  updates  the 
channels. 


while d ≠ ε do
  empty(d);
  ||_{i=1}^{n} if prev[i] ≠ nil then
                 new p = x = 0 in
                   c[prev[i]]?(p, x);
                   x[i], prev[i] := x ⊕ x[i], p
                 end;
  ||_{i=1}^{n} [c[i]!(prev[i], x[i]) || if prev[i] ≠ nil then d!tt]

In each iteration, n parallel processes are created and destroyed twice. The
equivalent,  more  efficient  BSP  program  contains  three  synchronization  steps. 

new decl in
  ||_{i=1}^{n} while d ≠ ε do
    empty(d);
    sync_{1,i};
    if prev[i] ≠ nil then
      new p = x = 0 in
        c[prev[i]]?(p, x);
        x[i], prev[i] := x ⊕ x[i], p
      end;
    sync_{2,i};
    [c[i]!(prev[i], x[i]) || if prev[i] ≠ nil then d!tt];
    sync_{3,i}
end

where  decl  is 

decl = done_{1,1} = ff, done_{2,1} = ff, done_{3,1} = ff,
       ...,
       done_{1,n} = ff, done_{2,n} = ff, done_{3,n} = ff

and each of sync_{1,i}, sync_{2,i} and sync_{3,i} is a synchronization step analogous to
the  one  above.  Development  tactics  for  the  introduction  of  such  synchronization 
points  could  be  investigated.  Note  that  the  resulting  program  above  is  more 
efficient,  but  also  a  lot  harder  to  reason  about.  Program  transformation  allows 
us  to  verify  a  simpler  representation  and  add  intricate,  performance-improving 
aspects  at  a  later  stage.  The  all-pair  shortest-paths  algorithm  in  Section  7.2 
could  be  transformed  in  a  similar  fashion.  In  general,  we  feel  that  BSP  style 
programs  are  very  amenable  to  development  through  stepwise  refinement. 

10.2.4  Extending  the  language 

Objects 

The  addition  of  object-oriented  features  is  more  difficult.  The  encapsulation 
of  state  is  already  supported.  It  is  unclear,  however,  how  inheritance  should 
be modeled. Existing work on formal models for parallel, object-oriented programming
might provide good starting points for future work in this area, e.g.,
[Ame92,  dB91].  As  pointed  out  by  Jones  [Jon96],  object-based  features  can 
be  very  helpful  in  taming  interference  and  lead  to  a  powerful  reasoning  and 
development  theory. 

Complexity  measures 

The  current  semantics  does  not  support  complexity  considerations.  In  fact,  due 
to  the  closure  conditions  the  two  programs 

skip and skip*

are indistinguishable. The ability to determine and compare the complexities
of different refinement options would greatly aid the design process. Consider
the all-pair shortest-paths algorithm of Section 7.2 again. Complexity considerations
lead us to reject refinement C4 and choose refinement C4 instead. A
tractable  formalization  of  these  considerations  in  terms  of  a  cost  calculus  would 
be  ideal. 

A  first  step  towards  a  cost  calculus  might  be  to  equip  transitions  and  traces 
with  time  bounds  t.  A  single  transition  step  would  thus  be  of  the  form 

(s, l, t, s')

where  t  could  indicate  the  exact,  maximal  or  minimal  duration  of  the  transition. 
The  duration  of  a  trace  would  be  the  sum  over  the  durations  of  its  transitions. 
The  duration  of  a  program  would  be  the  maximum  of  the  duration  of  its  traces. 
While  these  ideas  are  intuitive,  more  work  is  needed  to  determine  how  well  they 
mesh  with  our  theory.  The  formulations  of  the  closure  conditions  would  have  to 
be  reinvestigated.  Unless  a  stuttering  step  takes  no  time,  the  stuttering  closure 
condition  would  not  be  appropriate  any  more.  A  time-sensitive  formulation  of 
mumbling,  however,  might  be  possible. 
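
The proposed duration measure is easy to prototype. In the following Python sketch, which is an illustration and not part of the theory, a timed transition is a tuple (s, l, t, s'), the state is just the value of a single variable x, traces are finite, and every transition is assumed to take one time unit. The names trace_duration and program_duration are ours.

```python
from typing import List, Tuple

Transition = Tuple[int, str, float, int]   # (s, l, t, s') with time bound t
Trace = List[Transition]


def trace_duration(trace: Trace) -> float:
    """Duration of a trace: the sum of the durations of its transitions."""
    return sum(t for (_, _, t, _) in trace)


def program_duration(traces: List[Trace]) -> float:
    """Duration of a program: the maximum over the durations of its traces."""
    return max(trace_duration(trace) for trace in traces)


# x := 0 ; x := x + 1, started with x = 5 and without environment steps.
sequential = [
    [(5, "p", 1.0, 0), (0, "p", 1.0, 1)],
]

# x := 0 || x := x + 1 from the same state: both interleavings of the two steps.
parallel = [
    [(5, "p", 1.0, 0), (0, "p", 1.0, 1)],
    [(5, "p", 1.0, 6), (6, "p", 1.0, 0)],
]

print(program_duration(sequential), program_duration(parallel))
```

Both programs receive the same duration under this measure, because an interleaving model records the two parallel steps one after the other.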

Even under the time extension sketched above the semantics would still
be insensitive to the amount of parallelism a program exhibits. More
precisely, the programs x := 0 ; x := x + 1 and x := 0 || x := x + 1 would have the same
durations.  In  general,  interleaving  semantics  do  not  seem  to  be  well  suited  to 
measure  parallelism  and  a  more  radical  departure  from  our  theory  might  be 
necessary. 


10.2.5  Tool  support  for  parallel  programming 

Another,  more  applied  area  of  future  research  would  be  to  look  at  CASE  tools 
for  parallel  programming.  The  idea  is  to  use  the  concepts  presented  in  this  the¬ 
sis  for  the  implementation  of  a  software  development  environment  that  supports 
the  transformational  design  of  programs  with  explicit,  user-level  parallelism.  A 
session  with  the  system  would  consist  of  a  sequence  of  refinement  steps.  In  a 
refinement  step,  the  user  would  specify  a  part  of  the  current  program  to  be 
refined.  Then,  she  would  either  choose  from  a  list  of  applicable  refinement  or 
transformation  rules,  or  input  the  desired  result  of  the  refinement  step  upon 
which  the  system  would  search  for  a  sequence  of  rules  realizing  the  refinement. 
Assumptions  and  guarantees  would  be  collected,  discharged  in  the  existing  en¬ 
vironment  and  influence  the  introduction  of  new  parallel  components.  Ideally, 
assumptions  would  be  discharged  automatically.  However,  it  seems  that  such  a 
system  would  still  be  useful  if  discharging  the  assumptions  was  carried  out  by 
the  user,  and  the  system  just  did  the  bookkeeping.  To  the  best  of  our  knowledge 
there  is  currently  very  little  CASE  tool  support  for  parallel  programming.  The 
system  sketched  above  might  be  a  promising  first  step. 
10.3  Applications

Apart  from  its  intended  use  as  a  first  step  towards  a  universal  design  method¬ 
ology  for  parallel  programs,  our  work  might  also  be  applicable  to  other  areas. 

10.3.1  Formal  modeling 

A  very  promising  area  of  application  is  formal  modeling.  After  the  addition  of  a 
procedure  mechanism  the  language  will  be  expressive  enough  to  model  complex 
software  conveniently.  Moreover,  the  language  has  a  rich  and  powerful  theory. 
This combination makes it ideally suited for the analysis of relatively short, yet
intricate pieces of software such as protocols. Brookes' straightforward
verification of the Alternating Bit Protocol, mentioned above, is very
encouraging in this respect [Bro99]. CSP and variants have successfully been
used  to  model  software.  For  instance,  Allen  and  Garlan  have  employed  Wright, 
an  extension  of  CSP,  to  formally  model  and  analyze  architectural  connections, 
that  is,  the  interactions  between  components  [AG97].  It  seems  that  our  lan¬ 
guage  is  at  least  as  suited  for  such  purposes  as  CSP.  While  CSP  also  features 
strong  theoretical  underpinnings  and  a  notion  of  stepwise  refinement,  a  major 
advantage  of  our  framework  is  that  unlike  CSP  it  supports  both  states  and 
messages.  Depending  on  need,  either  one  or  the  other  can  be  emphasized,  or 
both  could  be  used  in  conjunction.  We  view  this  as  a  definite  advantage.  In 
our  attempts  to  model  aspects  of  High-level  Architecture  (HLA)  for  distributed 
simulation,  for  instance,  state-based  and  event-based  formalisms  turned  out  to 
be  useful  [AGI98,  Din99a]. 

A  major  advantage  of  CSP  is  the  possibility  of  automatic  verification  using 
tools  like  FDR  [For92].  The  widespread  use  of  our  approach  for  software  anal¬ 
ysis  will  thus  depend  on  the  availability  of  automatic  verification  tools.  The 
translation of a suitably restricted language subset into a finite state description
language for a model checker like SMV [BCL+94], for instance, would be a
promising first step.

10.3.2  Using contextual constraints to obtain smaller models

A major challenge facing automatic verification techniques like model checking is the
state  space  explosion  problem.  Typically,  the  state  space  grows  exponentially 
in  the  number  of  parallel  components  and  thus  quickly  becomes  too  large  to 
be  tractable.  By  definition,  the  behaviour  of  a  reactive  system  depends  on 
the  behaviour  of  its  environment.  Roughly  speaking,  knowledge  about  the  en¬ 
vironment  behaviour  translates  into  knowledge  about  the  system  behaviour. 
Consequently, knowing that the environment will never behave in a certain way
may allow a substantial simplification of the system. The simplified system
may  then  give  rise  to  a  smaller  state  machine  and  thus  be  amenable  to  auto¬ 
matic  verification  while  the  original,  unsimplified  system  was  not.  There  is  some 
hope  that  precisely  this  kind  of  simplification  based  on  contextual  constraints 
can be formalized in our setting. As an example, consider the following reactive
component. Commands are received along a channel in, executed on some
data structure D, and the result is sent back on channel out. The command
begin_record initiates storage of commands and their results in cmd_list. An
end_record causes the list to be sent along channel out and reinitializes cmd_list
to the empty list.

cmd_list := ⟨⟩;
rec := ff;
while tt do
    in?(cmd);
    case cmd of
        begin_record : rec := tt ; result := ok |
        end_record   : rec := ff ; result := cmd_list ; cmd_list := ⟨⟩ |
        else         : execute(cmd, D, result)
    end;
    out!(result);
    if rec then insert(cmd, result, cmd_list)
end.

If the environment never issues the begin_record command, the variable cmd_list
that stores the commands can be dispensed with. The above component can
safely be replaced by the following, simpler one.

cmd_list := ⟨⟩;
rec := ff;
while tt do
    in?(cmd);
    execute(cmd, D, result);
    out!(result)
end

These  kinds  of  simplifications  may  significantly  reduce  the  size  of  the  model  and 
open  up  a  way  to  automatic  verification. 
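
As an informal illustration of this simplification, the two components can be mimicked in Python and compared on a command stream. This is a sketch under assumptions that go beyond the text: the non-terminating loop is replaced by a finite list of commands, the hypothetical stateless function execute stands in for execute(cmd, D, result), and the test stream contains neither recording command.

```python
def full_component(commands, execute):
    """Recording component: returns the results sent on channel out."""
    cmd_list, rec, out = [], False, []
    for cmd in commands:                      # in?(cmd)
        if cmd == "begin_record":
            rec, result = True, "ok"
        elif cmd == "end_record":
            rec, result = False, list(cmd_list)
            cmd_list = []
        else:
            result = execute(cmd)
        out.append(result)                    # out!(result)
        if rec:
            cmd_list.append((cmd, result))    # insert(cmd, result, cmd_list)
    return out


def simplified_component(commands, execute):
    """Simplified component: every command is executed and answered."""
    return [execute(cmd) for cmd in commands]


# Hypothetical stand-in for the operation on the data structure D.
execute = str.upper

# An environment that never issues a recording command.
stream = ["push", "pop", "peek"]
same = full_component(stream, execute) == simplified_component(stream, execute)

# With recording, the full component additionally returns the stored list.
recorded = full_component(["begin_record", "push", "end_record"], execute)
```

On the restricted stream both components produce the same outputs, while the second run shows the extra state that the recording commands bring into play and that a model checker would otherwise have to track.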
Chapter  11

Conclusion 


This  thesis  presents  a  framework  for  the  formal  development  of  parallel  pro¬ 
grams.  The  framework  rests  on  the  following  four  components. 

1.  Initial  requirements,  executable  programs,  and  intermediate  combinations 
of programs and abstract requirements are expressed in terms of a wide-spectrum
specification language. This specification language supports fair
parallelism, shared-variable and message-passing concurrency, and local
variables and channels. Labels allow the separation of the behaviour of a
subprogram  from  its  environment.  The  expressive  power  of  the  language 
is  demonstrated  by  means  of  a  variety  of  parallel  and  distributed  programs 
and  reactive  systems. 

2.  The language is given a compositional trace semantics based on Brookes'
transition trace semantics [Bro96a], from which it inherits many of its
properties. More precisely, the behaviour of a program C is captured by
a closed set of sequences of triples (s, l, s') where s and s' are states and l
is a label indicating whether the transition was contributed by a labeled
subprogram of C or not. Due to two closure conditions, the semantics
validates many natural laws of parallel programming. Moreover, the semantics
can easily be extended to fine-grained notions of concurrency.

3.  A  context-sensitive  notion  of  approximation  is  defined  that  allows  the  com¬ 
parison  of  two  programs  with  respect  to  a  particular  context.  It  is  used 
not  only  for  the  definition  of  the  refinement  relation  but  also  for  the  speci¬ 
fication  and  verification  of  liveness  properties  like  eventual  entry  and  thus 
plays  a  crucial  role  in  the  calculus. 

4.  Context-sensitive  approximation  and  assumption-commitment  reasoning 
are  combined  to  form  the  refinement  relation.  This  relation  is  context- 
sensitive  and  supports  stepwise,  top-down  program  development  and  com¬ 
positional  reasoning,  the  introduction  of  local  variables  and  channels,  and 
the seamless treatment of shared-variable and message-passing concurrency.
The refinement calculus identifies a number of rules that govern
the refinement relation. Most of the rules are compositional.

Moreover,  the  usage  and  applicability  of  the  framework  is  demonstrated  through 
a  wide  variety  of  examples  that  involve  shared-variable  parallel  programs,  dis¬ 
tributed  programs,  and  mutual  exclusion  algorithms.  All  but  one  example  em¬ 
ploy  an  arbitrary  but  fixed  number  of  parallel  processes  to  implement  the  un¬ 
derlying  algorithms.  The  refinement  calculus  allows  not  only  the  development 
of  a  single  implementation,  but  also  the  documentation  of  design  decisions  and 
the  principled  exploration  of  alternative  solutions.  The  formal  analysis  of  the 
n-process  tie-breaker  algorithm  for  mutual  exclusion,  for  instance,  also  gives  rise 
to the discovery of alternative implementations, some of which exhibit substantially
more parallelism than the standard textbook implementation. Just like
the semantics, the framework can easily be extended to fine-grained notions of
concurrency.

While  the  refinement  formulas  involved  in  these  examples  can  get  complex, 
they  always  remain  manageable.  This  is  because  the  algorithms  feature  a  well- 
defined  interface  between  parallel  processes  which  can  be  captured  conveniently 
in  the  refinement  formulas.  If,  however,  the  processes  of  a  parallel  program  are 
so  tightly  coupled  as  to  render  compositional  reasoning  and  development  impos¬ 
sible,  our  framework  will  not  be  applicable.  We  consider  these  kinds  of  parallel 
programs  anomalous.  Due  to  the  tight  coupling,  the  degree  of  parallelism  is 
likely  to  be  very  small,  making  it  conceptually  cleaner  and  computationally 
more  efficient  to  merge  the  parallel  processes  into  one. 

At  present,  the  most  prominent  limitations  of  our  work  include  the  lack  of  a 
procedure  mechanism,  the  lack  of  support  for  the  development  of  circular  process 
topologies  like  token  rings,  and  the  lack  of  practical,  experimental  results.  While 
more  work  is  needed  to  extend  the  framework  appropriately  and  validate  its 
practical  feasibility,  we  are  confident  that  it  can  be  done.  The  strong  theoretical 
underpinnings  of  the  framework  and  its  applicability  leave  us  convinced  that  this 
work  presents  a  very  promising  first  step  towards  a  viable,  formal  development 
methodology  for  parallel  programs. 


Appendix  A 

Proofs 


A  few  lemmas  are  needed  for  the  proofs  in  this  section. 

If a program approximates another with respect to the fine-grained, unclosed
trace semantics $\mathcal{T}$, then that approximation also holds under the
coarser-grained, closed semantics $\mathcal{T}^\dagger$.

Lemma  A.l  (Closure  and  trace  and  execution  inclusion) 

1.  If  Cl  Cr  C2  [mod  7),  then  Ci  Crt  C2  [mod  V).  If  Ci  C^t  C2  [mod  7), 
then  Cl  Cj-t  C2  [mod  7). 

2.  If  Cl  C5  C2  [mod  7),  then  Ci  C^t  C2  [mod  7).  If  Ci  C^t  C2  [mod  7), 
then  Cl  C^t  C2  [mod  7). 

3.  If  r[Ci]  C  rt[[C2l  (mod  7),  then  Ci  C^t  C2  [mod  7). 

Proof: 

♦  1)  We  show  the  lemma  for  7  =:  {x}.  The  more  general  case  follows. 
Assume  Ci  Cj-  C2  [mod  x).  We  show  Ci  C^-t  C2  [mod  x).  The  proof  of 
C'l  Qri  [mod  x)  is  analogous.  Let  a  G  T^[Ci]  such  that  [x  =  v)a  G 
T^|C2l  for  some  v. 

Case:  a  G  T|CiJ.  Then,  by  assumption  and  the  definition  of  the  closure 
conditions,  there  exists  /?  G  T^[C2|  such  that  (a?  =  u/?  G  TIC2I  and 
=  ld\x. 

Case:  a  ^  T|[Ci].  Then,  there  exists  a'  G  T|Ci|  such  that  a'  is  obtained 
from  a  through  the  stuttering  closure  condition  and  [x  =  v)a'  G 
TfCiJ.  Then,  by  assumption,  there  exists  G  TIC2I  such  that 
{x  =  -y)/?'  G  T[[C2|  and  a\x  =  P^x.  Consequently,  there  is  /?  G 
T^[C2]  such  that  [x  =  v)^  G  T|[C2l  and  a\x  =  jd\x. 

This  concludes  the  proof  of  Ci  C2  [modx). 

•  2) and 3) are proved similarly to 1) above.
A.1  Programs,  contexts  and  traces  (Section  2)

A.1.1  Proof  of  Lemma  2.1  on  page  20

1.  Corollary  to  Lemma  A.l  above. 

2.  Using the definition, the fairmerge relation can be shown to be associative
and commutative, that is, $(\alpha \| \beta) \| \gamma = \alpha \| (\beta \| \gamma)$ and $\alpha \| \beta = \beta \| \alpha$ for all
traces $\alpha$, $\beta$, and $\gamma$. Associativity and commutativity of parallel composition
follow.

3.  We  show  that  D  Cjt  implies  C  ;  D*  =7-^  C.  The  remaining 

equivalences can be proved similarly. $C ; D^* \supseteq_{\mathcal{T}^\dagger} C$ follows from the
fact  that  €  6  jD*  by  definition  of  the  Kleene-star  operation.  To  show 
C'^D*  Cjt  C,  assume  D  C'j-t  and  a  £  7^|C;Z>*].  Thus,  a  is 

of  the  form  a  =  aia2  where  ai  G  7^[C]  and  aj  €  Due  to  the 

assumption  02  consists  of  finite  stuttering  only.  Thus,  a  =  0103  G  T^\C\ 
due  to  the  stuttering  closure  condition.  Thus,  C\D*  C^-t  C.  The  desired 
result  follows  with  Lemma  2.2. 

4.  Directly  from  the  definition. 

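5.  To show $[C_1 \vee C_2] \| C_3 \subseteq_{\mathcal{T}} [C_1 \| C_3] \vee [C_2 \| C_3]$, let $\alpha \in \mathcal{T}[\![[C_1 \vee C_2] \| C_3]\!]$.
Thus, $\alpha$ arises from fairly merging two traces $\beta \in \mathcal{T}[\![C_1 \vee C_2]\!]$ and $\gamma \in \mathcal{T}[\![C_3]\!]$.
Case: $\beta \in \mathcal{T}[\![C_1]\!]$. Then, $\alpha \in \mathcal{T}[\![C_1 \| C_3]\!]$. Case: $\beta \in \mathcal{T}[\![C_2]\!]$. Then,
$\alpha \in \mathcal{T}[\![C_2 \| C_3]\!]$. In both cases, $\alpha \in \mathcal{T}[\![[C_1 \| C_3] \vee [C_2 \| C_3]]\!]$.

The second inclusion $[C_1 \vee C_2] \| C_3 \supseteq_{\mathcal{T}} [C_1 \| C_3] \vee [C_2 \| C_3]$ is shown
similarly.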

6.  Directly  from  the  definition  of  new. 

7.  Directly  from  the  definition  of  new. 

8.  Directly  from  the  definition  of  C  V  C. 

9.  By  structural  induction  over  E.  ■
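
The algebraic laws used in items 2 and 5 can be checked on small finite examples. The Python sketch below is only an illustration under simplifying assumptions: traces are finite tuples of opaque step names rather than transition triples, merging is plain interleaving (fairness is vacuous for finite traces), nondeterministic choice is modelled as set union, and the function names merge and merge_sets are hypothetical.

```python
def merge(a, b):
    """All interleavings of two finite traces given as tuples."""
    if not a:
        return {b}
    if not b:
        return {a}
    first_from_a = {(a[0],) + rest for rest in merge(a[1:], b)}
    first_from_b = {(b[0],) + rest for rest in merge(a, b[1:])}
    return first_from_a | first_from_b


def merge_sets(A, B):
    """Parallel composition of trace sets: merge every pair of traces."""
    result = set()
    for a in A:
        for b in B:
            result |= merge(a, b)
    return result


# Trace sets of three hypothetical programs C1, C2 and C3.
T1 = {("a1", "a2")}
T2 = {("b",)}
T3 = {("c",)}

commutative = merge_sets(T1, T2) == merge_sets(T2, T1)
associative = (merge_sets(merge_sets(T1, T2), T3)
               == merge_sets(T1, merge_sets(T2, T3)))
# Choice is union, so [C1 v C2] || C3 has the traces of [C1 || C3] v [C2 || C3].
distributive = (merge_sets(T1 | T2, T3)
                == merge_sets(T1, T3) | merge_sets(T2, T3))
```

All three flags evaluate to true for these inputs. This is a sanity check on examples, not a substitute for the proofs above.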

A.1.2  Proof  of  Lemma  2.2  on  page  21

1.  Proof  is  similar  to  the  one  for  Lemma  2.1.1. 

2.  Directly  from  the  definition  of  executions. 

3.  Directly  from  the  definition  of  the  Kleene-star  operation. 

4.  Directly  from  the  congruence  property.

5.  Directly  from  the  definition  of  c/v:[p,q]  and  T|[U:[P,  Q]|. 

6.  Using  structural  induction  over  E. 
7.  We only consider the special case where n = 1. The general case is an
easy corollary.

$\Rightarrow$: Let $\alpha$ be a trace of new x = e in $C_1$ and let e have value v in $first(\alpha)$.
By definition of new, $\alpha$ is of the form $\alpha = \alpha' \backslash x$ for some trace $\alpha'$ of $C_1$
such that $\langle x = v \rangle \alpha'$ also is a trace of $C_1$. By assumption, $C_2$ has a trace
$\beta'$ such that $\langle x = v \rangle \beta'$ also is a trace of $C_2$ and $\alpha = \alpha' \backslash x = \beta' \backslash x$. By
definition of new, $\beta' \backslash x$ also is a trace of new x = e in $C_2$.

$\Leftarrow$: Let $\alpha$ be a trace of $C_1$ such that $\langle x = v \rangle \alpha$ also is in $C_1$ where v
is the value of e in $first(\alpha)$. Then, by definition of new, $\alpha \backslash x$ is a trace
of new x = e in $C_1$, and by assumption also of new x = e in $C_2$. By
definition of new, there exists a trace $\beta$ of $C_2$ such that $\langle x = v \rangle \beta$ also is
a trace of $C_2$ and $\alpha \backslash x = \beta \backslash x$.

8.  Directly  from  the  definitions.  ■ 

A.2  Refinement  calculus  (Section  5.4)

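Recall that the operation $[\alpha \mid x = v]$ sets the value of x along all of $\alpha$ to v.
Formally, if $\alpha = (s_0, l_0, s_0')(s_1, l_1, s_1') \ldots$, then

$[\alpha \mid x = v] = ([s_0 \mid x = v], l_0, [s_0' \mid x = v])([s_1 \mid x = v], l_1, [s_1' \mid x = v]) \ldots$

Lemma A.2  If $\alpha \in \mathcal{T}^\dagger[\![C]\!]$ and $x \notin fv(C)$, then $[\alpha \mid x = v] \in \mathcal{T}^\dagger[\![C]\!]$ for all
$v \in Dom_x$.  □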
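
The overwrite operation on traces recalled above can be pictured concretely. The following Python sketch assumes that states are dictionaries from variable names to values and that a trace is a finite list of (state, label, state) triples; the name set_var and this representation are illustrative assumptions, not notation from the thesis.

```python
def set_var(trace, x, v):
    """Return a copy of the trace in which x has value v in every state."""
    return [({**s, x: v}, label, {**s2, x: v}) for (s, label, s2) in trace]


# A two-step trace over the variables x and y; 'p' and 'e' are labels.
alpha = [
    ({"x": 0, "y": 1}, "p", {"x": 5, "y": 1}),
    ({"x": 5, "y": 1}, "e", {"x": 5, "y": 2}),
]

overwritten = set_var(alpha, "x", 7)
```

Every state of overwritten maps x to 7 while the values of y and the labels are untouched, and alpha itself is not modified. Lemma A.2 states that such an overwrite keeps a trace inside the semantics of C whenever x is not free in C.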

We  now  prove  the  soundness  of  the  rules  of  our  refinement  calculus. 

A.2.1  Basic  rules

Rule  ATOM 

1.  Due  to  the  first  two  premises  we  already  have 

[P,T]  A2  [Q,A] 

and 

[p,r]  [tt,A]. 

2.  Thus,  we  only  need  to  show 

Ai  >E  A2  [mod  V) 

where 

^={P};[n  ||pre~r] 

and 

V  =  {a?! , . . .  ,  . 
We  first  show  execution  inclusion  with  respect  to  the  unclosed  semantics,
that  is,  we  show 


E[{Ai)]  Ds  E[{A2)]  (modV),  (A.l) 

that  is,  for  all  a  G  ^[£'[(^2)]!  there  exists  /?  such  that 

/?  G  SIE[{A,)]}  and 
/?  =  a  (mod  V). 

Let  a  G  ^l£[(^2)]l-  is  of  the  form  a  =  0^1(5,  where  ai  and  02 

consist  of  environment  transitions  of  pre^T  only.  Due  to  the  shape  of  E, 
the  first  state  of  oi  satisfies  P  and  P  is  preserved  along  ai  due  to  the  fact 
that  P  G  r.  Consequently,  s  also  satisfies  P,  s  1=  P.  Thus,  the  transition 
(s,p,  s')  satisfies  (5,p, s')  \=P  Ac/a^  and  (s,p, s')  ^  3V.  P  Ac/a^.  By  the 
first  premise,  (s,p,  s')  ^  3V,  P  Ac/a^^  That  is,  there  are  values  vi, . . . , 
for  the  local  variables  Xi,. .  .,Xn  such  that 

(«,P,[s'ki  =  Vi,...,x„  =  u„])  t=P  Ac/yi,. 

Let  02  be  like  02  except  that  each  local  variable  Xj  is  set  to  Vj, 

a'2  =  [q2|xi  =  =  v„]. 


Let 

/?  =  ai(s,p,  [s'jxi  =  =  u„])a2. 

/?  is  interference-free  and  thus  an  execution.  Since  none  of  the  local  vari¬ 
ables  are  changed  along  02,  and  02  and  02  are  identical  otherwise,  02 
must  also  preserve  all  predicates  in  F,  that  is,  02  G  T|pre°®rj.  Thus, 
we  get  (3  G  ^l£[(^i)]l  and  a  =  0  (mod  K).  This  concludes  the  proof  of 
(A.l).  With  Lemma  A.l 

£^[(^1)]  2s*  E[{A2)]{modV) 

which  is  equivalent  to 

Ai  >E  A2  (mod  V) 

as  desired. 

Rule  SEQ 

1.  We  have  to  show 


[p,riur2]  c[-,c'2  [Q,AinA2] 


and 


[p,  riur2]  C'i;C2  [«,AinA2]. 
(a)  Let $\alpha \in \mathcal{T}^\dagger[\![C_1' ; C_2']\!]$ and let $\alpha \models \mathit{assump}(P, \Gamma_1 \cup \Gamma_2)$. We need to
show $\alpha \models \mathit{guar}(Q, \Delta_1 \cap \Delta_2)$.

Case:  a  E  Tilt'd.  Then,  a  is  infinite.  Also,  by  the  first  premise  and 
weakening  a  j=  Ai  Pi  A2).  Since  a  is  infinite,  this  implies 

a  ^  guar{Q,  Ai  H  A2)  as  desired. 

Case:  a  ^  Then,  by  definition  of  a  is  of 

the  form  a  =  aia2  where  ai  E  and  02  C  Then, 

ai  assump{P^TiUT2)-  By  the  first  premise  and  weakening, 
ai  1=  guar{Qi,  Ai  D  A2)  for  some  Qi.  Due  to  Lemma  5.6,  there  is  a 
set  of  predicates  sp{'Jli)  such  that  5p(7?.i)  C  Ti  and  /\sp{TZi)  =>  Qi 
and 

[PJi]  c;  [A5p(7^l),Al]. 

Since  a  |=  assump{P,Ti  U  r2),  and  each  predicate  in  sp{lZi)  is  also 
in  Fl,  /\sp{lZi)  is  preserved  across  the  gap  between  last{ai)  and 
first{a2)i  that  is,  first{a2)  |=  /\sp{IZi)  and  Thus,  we  have/zrsi(a2)  |= 
Qi.  Thus,  a2  1=  as5t/mp(Qi,ri  ur2).  By  the  second  premise 
and  weakening,  02  1=  guar{Q ^  AiD  A2).  Consequently,  Qfia2  j= 
guar{Q^  Ai  fl  A2).  Lemma  3.3  implies  the  result. 

(b)  Analogous  to  the  proof  of  case  (a)  above.

2.  We  need  to  show 


Ci;C2>eC[;C:^  (mod  Vi  [JV2) 


where 


E  =  {P};[[]  ||pre-riUr2] 

and 


Vi  —  ^  XfYi) . 

We  first  show 

E[{Ci;C2)]  2e  E[{C[-,a^)]  (mod  V^\JV2), 

That  is,  for  every  a  E  S\E[{C[ ;  there  exists  /S  such  that 

^E5[^[{Ci;C2)]1  and 
/?  =  a  (mod  Vi  U  ^2)- 

The  first  premise  implies 

Cl  >E,  C[  (mod  Vi) 

where 


(A.2) 


El  = 
Due  to  Lemma  4.4  we  get  E  Q  Ei.  Thus,

Cl  >E  C[  {mod  V1UV2).  (A.3) 

Utae£lE[{C[’,C!2)]l 

Case:  a  E  Then,  a  is  infinite.  By  (A.3)  there  exists  /?  such 

that 

•  /?  €  ^l£^[(C'i)]l  and 

•  a  =  /?  (mod  Vi  U  V2). 

Since  jS  must  also  be  infinite  we  get 

•  /?ea^[(Ci;C72)]l  and 

•  a  =  p  [mod  V\  U  V2). 

Case:  a  ^  Due  to  the  definition  of  sequential  composition, 

T^\C[ ;  Cy,  a  is  of  the  form  a  =  aia2  where  a\  is  finite  and 

c^ieSlE[{C[m 

(^2e£l{C!,)  \\p7^^TiUT2l 
By  (A.3)  there  exists  /?i  such  that 

•  /?i  G  ^I^[(Ci>]l  and 

0  ai  =  (mod  Vx  U  V2). 

Moreover,  by  the  first  premise,  the  last  state  of  ai  satisfies  Qi,  last{ai)  ^ 
Qi.  Since  a  is  an  execution,  the  last  state  of  ai  is  identical  to  the  first 
state  of  02  and  thus  first{a2)  \=  Qi.  Thus, 

a2e£lE2[{Ci,)]j 

where 

^2  =  {Qi};[C]  ||pT^"^riur2]. 

Using  the  second  premise  and  an  argument  similar  to  above,  we  obtain 
C2  >E,  C'2  [mod  Vi  U  U2). 

Thus,  there  exists  02  such  that 

•  02  e  ^I£'2[(C'2}]]  and 

•  a2  =  02  [mod  Vi  U  V2). 

Note  that  first{a2)  =  first{02)  due  to  the  definition  of  02  =  02  [mod  V). 
Thus,  lQst{0i)  and  first{02)  may  differ  only  in  the  values  they  assign  to 
the  variables  in  Vi .  Let  vi, . . . ,  be  the  values  of  xi, . . . ,  in  last{0i) 
and  let  02  ~  •  •  • ,  Since  C2  doesn’t  depend  on  any 

of  the  variables  in  Ui,  that  is,  Vx  nfv{C2)  =  0, 

E  £IE[{C2)]\ 


with  Lemma  A. 2.  Thus, 
•  G  SlE[{Ci ;  C^m  and

♦  Qia^  =  /?i/?2  {mod  Vi  U  Vi)- 

This  concludes  the  proof  of  (A.2).  With  Lemma  A. 1  we  get 
E[{C[;C',)]  Dsx  S[(Ci ;  C2)  (mod  Vi  U  Vi) 
which  is  equivalent  to 

C[  ;  C’2  >E  Cl  ;  C2  (mod  V^i  U  Vi). 


Rule  OR 

1.  We  have  to  show 


[P,riur2]  C'lVCi,  [Q.AinAa] 

and 

[p,riur2]  CiVC'2  [tt.AinAs]. 

(a)  Let  a  G  T^\C'i  V  Ci|  and  let  a  |=  assump(P,Ti  U  r2).  We  need  to 
show  that  a  |=  guar{P^  Ai  fl  A2). 

Case;  a  G  Then,  a  [=  assump[P^Ti  ur2).  Using  the  first 

premise,  we  get  a  guar{Q^Ai)  and  thus  ot  1=  guar{Q^^i  D  A2). 
Consequently,  [P,  Fi  U  Fi]  a  [Q,  Ai  f!  A2]  and  Lemma  3.3  implies 
the  result.  Case:  a  G  Analogous  to  above  case, 

(b)  Analogous  to  the  proof  above.

2.  We  need  to  show 


Cl  V  C2  >E  c[  V  [mod  Ui  U  V2) 

where 

P={PiAP2};[[]  ||pre""FiUF2]. 

We  first  show 

E[{Ci  V  C2)]  E[{C[  V  C^)]  [mod  Ui  U  U2),  (A. 4) 

that  is,  for  every  a  G  8\E[{C[  V  C^)]]  there  exists  fd  such  that 

/?  G  £lE[{Ci  V  C2)]l  and 
fd  =  a  [mod  Vi  U  14). 

LetaeeiE[{C[yC!,)]l 

Case:  a  G  8lE[{C[)]l.  The  first  premise  implies 

Cl  >E,  c'  [mod  Vi) 
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where 

£i  =  {P};[C]  ||pr^~ri]. 

Since  pre^Ti  U  r2  pre^Ti,  we  have  E  Q  Ex  by  Lemma  4.4.  Thus,
also 


Cl  >E  C[  {mod  V1UV2). 

Consequently,  there  exists  /?  such  that 

/?  €  €lE[{Cx)]j  and 
/?  =  a  {mod  Vi  U  14)* 

By  definition  of  T^ICi  V  C2I, 

13  e  eiE[{Cx  V  C2)]}  and 
/?  =  a  {mod  Vi  U  V2). 

Case:  a  G  Analogous  to  case  above. 

This  concludes  the  proof  of  (A. 4).  By  Lemma  A.l  we  have 

^[(C'iVC2>]  2ft  E[{C[yC'2)]  {mod  V1UV2)] 

which  is  equivalent  to 

Cl  V  C2  Cl  V  C2  {mod  Vi  U  V2) 

as  desired. 

Rule  STAR 

Using  the  premise  we  can  show  by  induction 

[/,r]  CTyviCT  [/,A] 

for  all  n  >  0.  By  OR  we  get 

[i,T]  vr=iC*Vv  vr=i(c''r  [^.a] 

for  all  n  >  0.  The  desired  result  follows. 

Rule  OMEGA 

Similar  to  the  proof  of  STAR  using  transfinite  induction. 
Rule  NEW

1 .  We  have  to  show 

[P[v/x],T']  new  x  =  v  in  C^  [3x,Q,  A'] 

and 

r^]  new  a:  =  in  C  [tt^A']. 

(a)  If  P[v/x]  is  unsatisfiable,  then  both  refinements  above  are  vacuously 
true.  If  P[v/x]  is  satisfiable,  let  a  G  T^[new  x  =  in  C'J  such  that 
a  assump{P[x/v]jT^).  By  definition  of  new,  there  exists  /?  such 
that  {x  —  v)P  G  T^[C"]  and  a  is  of  the  form  a  =  (}\x.  Thus, 

/?\x  1=  assump{P[x/v]^C), 

This  implies  first{{x  =  v)/3)  P,  Moreover,  since  all  predicates  in  F' 
are  preserved  along  gaps  in  j3\x  and  the  value  of  x  does  not  change 
across  gaps  in  {x  =  v)(3^  all  predicates  in  F  are  preserved  along 
gaps  in  {x  =  v)l3.  Thus,  (x  =  v)P  [=  assump[P^V),  By  the  first 
premise,  {x  =  v)j3  [=  guar{Qj  A)  which  implies  last{(x  =  v)P)  |=  Q, 
Since  last{{x  =  v)j3)  differs  from  lasi{/3\x)  only  in  the  value  of  x^ 
we  have  last{l3\x)  }=:  3x.Q.  Moreover,  since  no  transition  along  ^\x 
changes  the  value  of  x^  P\x  also  preserves  all  predicates  in  A'.  Thus, 
P\x  \=  guar{3x,Q,  A'),  a  j=  guar{3x.Q,  A')  follows. 

(b)  Analogous  to  proof  above. 

2.  We  show 

E'[(new  x  =  V  in  C)]  Ds  £'[(new  x  =  v  in  C')]  {mod  V) 


where 

E  =  {P[v/x]};[n  Wpre^T'] 

and 

V  = 

and  X  ^  V.  That  is,  for  every  a  €  £|£'[(new  x  =  v  in  C')]]  there  must 
exist  such  that 

jd  G  5[E'[(new  a;  =  t;  in  C)]]  and 
(3  =  a  [mod  V). 

Let  a  G  ^|£'[(new  x  =  v  in  C')]],  that  is, 

a  G  ^l{P[v/^]} ;  [(new  x  =  v  in  C')  ||  pre'^F']]. 


Thus,  a  is  of  the  form  a  =  aia2  where 
♦  Prefix $\alpha_1$ consists of a finite number of environment steps and its
initial state satisfies P[v/x]. By definition of $\Gamma'$ all these environment
steps preserve P[v/x] and thus the last state of $\alpha_1$ also satisfies
P[v/x].

♦  Suffix  02  is  an  execution  of  (new  x  =:  in  C')  ||  That  is, 

there  is  a  trace  a2\x  such  that  F'  is  preserved  across  all  gaps  along 
a2\x  and  {x  =  i;}a2  €  T[(C")].  Since  the  value  of  x  does  not  change 
across  gaps  in  (x  =  t;}a2,  all  predicates  in  F  are  preserved  across 
gaps  in  (x  =  v)a2^  Thus,  (x  =  v)a2  ^  corresponding  execution  S 
in  II  pre^^FJ.  The  premise  of  the  NEW  rule  implies 

(C)llpre^F  {C)  \\pr^^T  {modV). 

Thus,  there  exists  a  trace  ^2  of  (C')»  such  that  all  predicates  in  F  are 
preserved  across  gaps  and  the  execution  corresponding  to  coin¬ 
cides  with  S  modulo  the  variables  in  K,  that  is,  <5  =  (J'  (mod  V).  Due 
to  the  definition  of  equivalence  of  executions  modulo  V  (Definition  4.2
on  page  44),  and  to  x  and  must  coincide  with  respect  to  x 

after  the  initial  state.  Consequently,  x  =  t;  in  the  initial  state  of  /SJ 
and  X  does  not  change  across  gaps  along  that  is,  (x  =  v)/?2  =  /?2* 
Thus,  there  is  an  execution  ^2  of  (new  x  =  t;  in  C)  ||  pre°^F'  such 
that  02  =  ^2  (mod  V).  Again,  due  to  the  definition  of  equivalence  of
executions  modulo  K,  the  initial  states  of  02  and  132  must  be  identical, 
that  is,  first(a2)  =  ^rsf(/?2)* 

Let  P  =  ai/?2.  The  trace  0102  is  an  execution  by  assumption.  Thus,  the 
final  state  of  oi  (which  must  exist,  because  oi  is  finite)  and  the  initial  state 
of  02  are  identical,  that  is,  last(ai)  =  first(a2)^  Consequently,  last(ai)  = 
fir$t(/32)-  It  follows  that  /?  is  an  execution  of 

{P[v/x]}  ;  [(new  x  =  t;  in  C)  ||  pre°°F'] 

with  o  =  01^2  (mod  V). 

Finally,  by  Lemma  A.l 

^[(new  X  =  1;  in  C}]  £'[(new  x  =  t;  in  C^)]  (mod  V) 

which  is  equivalent  to 

new  X  =  V  in  C  >E  new  x  =  t;  in  C"  (mod  V), 


Rule  WEAK 

Follows  directly  from  Lemma  5.4  on  page  64. 
Rules  PAR,  PAR-V,  PAR-N,  PAR-V-N
We  show  soundness  of  PAR. 

1.  We  have  to  show 

[PiAPz.riUTz]  C[\\C'^  [Qi  AQ2,AinA2]. 


[Pi  A  P2,  Fi  U  r2]  Ci\\C2  [tt,AinA2] 

(a)  Let  a  G  T^\Ci  ||  and  let  a  |=  assump{Pi  f\  P2,Til}T2)^  By 
definition  oiT^\C\  ||  C2I  there  exist  aj  and  ^2  such  that  ai  G 
and  a2  E  and  a  G  Ofi  |1 0:2.  We  first  show  that  every  gap  along 

a\  preserves  Fi  and  every  gap  along  ^2  preserves  r2,  that  is,  ai  |= 
assump[PiyTi)  and  a2  |=  assump{P2,T2)  Suppose  the  contrary, 
that  is,  suppose  oi  ^  assump[P\^Vi)  and  0^2  ^  asswmp(P2  5  r2). 
Consequently,  there  must  be  and  that 

ai  =  a[{sm,e,s'^)a'{ 

02  =  o'2(^n,e,t;)a'2' 

a[  =  (so,e,s'o)(si,e,s'i)  . . .  ,  e, 

where  a[  and  02  longest  prefixes  of  oi  and  02  that  satisfy 

assump{Pi^Ti)  and  assump{p2jT2)  respectively.  Formally,  the  gap 
does  not  preserve  P'  for  some  P'  G  Fi,  and 
does  not  preserve  P"  for  some  P^'  G  F2.  Let  a'  be  the  prefix  of  a 
up  to  (and  including)  (s^,e,s^)  or  whichever  comes  first 

in  a.  Note  that  a'  always  exists.  Without  loss  of  generality  as¬ 
sume  that  (syj^,e,sJ^)  comes  before  in  a.  Then,  a'  is  of 

the  form  a'  =  e,  e,  where  ^2  is  a  (possi¬ 

bly  empty)  subtrace  of  02-  We  will  show  that  the  gap  (5jyj_|.i ,  Sm) 
must  preserve  F 1  which  contradicts  the  maximality  of  0[\ .  Since  a  1= 
assump[Pi  A  P2,  Fi  U  F2)  by  assumption  and  a'  is  a  prefix  of  a,  it  fol¬ 
lows  that  a'  [=  assump{Pi  AP2,Fi  UF2)  and  a'  |=  asswmp(P2,  F2) 
by  weakening.  Thus,  a'  |=  ^uar(^^,  A2).  Since  Fi  C  A2,  also 
a'  1=  guari^t^Vi)  by  weakening.  Since  /?2  is  a  subtrace  of  a',  it 
also  preserves  Fi,  that  is,  /?2  guar[tt^T{).  Since 

a  \=  assump[P\  A  P2,  Fi  U  F2), 

it  follows  that  if  P'  G  Fi  holds  in  s^rn-i^  ^hen  it  is  preserved  not  only 
by  the  environment  but  also  along  P2  and  thus  P'  will  also  hold  in 
Sm>  This,  however,  contradicts  the  maximality  of  ai.  Thus,  we  con¬ 
clude  a\  t=  assump{PijTi)  and  a2  t=  a5stimp(P2, F2).  By  the  two 

^Note  that  this  is  not  obvious.  Every  gap  in  a  might  preserve  Fi,  while  not  every  gap  in 
ai  does. 



premises  this  implies  ai  ^  guar{Qi^  Ai)  and  a2  [=  guar{Q2jA2)^ 
Consequently,  under  the  assumptions  Pi  A  P2  and  Fi  U  r2,  every 
transition  of  Ci  and  C2  preserves  Ai  and  A2  respectively.  We  now 
show  that  a  guar{Qi  AQ21A1  fl  A2). 

♦  Every  transition  (s,  e,  s')  along  a  is  either  from  ai  or  a2  and  thus 
preserves  either  Ai  or  A2.  Consequently,  (s,e,s')  must  preserve 

Ai  n  A2. 

•  Let  a  be  finite.  Then,  ai  and  a2  are  finite.  By  assumption, 
last{ai)  f=  Qi  and  last(a2)  |=  ^2*  By  Lemma  5.6,  sp{1li)  C 
Fl  C  A2  and  sp{Tl2)  C  F2  C  Ai  sp{Tli)  =>  Qi  and  sp{1l2)  Q2 
where  Hi  denotes  the  premise  refining  Ci  into  C(  and  7^2  denotes 
the  premise  refining  C2  into  C2.  Thus,  last{a)  f=  Qi  A  ^2- 

Thus,  a  [=:  ^uar(Qi  A  Q2)  Ai  fl  A2).  Lemma  A.l  completes  the 
proof. 

2.  We  first  show 

^[(<^111^2)]  Ds  E[{c[\\c;,)]  {mod  V1UV2)  (A.5) 


where 

P={Fi  AP2};[[]  ||pr^""FiUF2]. 

That  is,  for  every  a  6  SlE[{C[  ||  C^)]!  there  exists  /3  such  that 

0  G  €lE[{Ci  11  C2)]I  and 
/?  =  a  {mod  Vi  U  V2). 

For  a  contradiction,  suppose  the  contrary,  that  is,  for  all  G  £lE[{Ci  ||  C2)]l 
with  0  =  a  {mod  V1UV2)  we  have 

l3^eiE[{Ci\\C2)]l 

Case:  There  is  a  longest  prefix  0i  of  0  that  can  be  extended  to  an  ex¬ 
ecution  in  £lE[{Ci  11  C2)]],  that  is,  let  0  =  /?i(sn,in,«n+i)/?2  such 
that 

/?i7€^[^[<Ci||C2)]1, 

for  some  7,  but 


0l{SnJn.Sn+x)S^SlE[{Ci  \\  C2)]} 

for all $\delta$. Since the two programs $E[(C_1 \parallel C_2)]$ and $E[(C_1' \parallel C_2')]$ have
the same environment transitions, $l_n$ must be $p$, that is, $(s_n, l_n, s_{n+1})$
must be a program transition.

Subcase: $(s_n, l_n, s_{n+1})$ was contributed by $C_1$. Due to the shape of $E$,
every environment transition along $\beta_1$ preserves $\Gamma_1$. By the second
premise, every program transition along $\beta_1$ contributed by
$C_2$ preserves $\Delta_2$ and thus also $\Gamma_1$, since $\Gamma_1 \subseteq \Delta_2$ by assumption.
More precisely, for all transitions $(s_i, l_i, s_{i+1})$ in $\beta_1$, if $l_i = e$, or
$l_i = p$ and $(s_i, l_i, s_{i+1})$ was contributed by $C_2$, then $(s_i, l_i, s_{i+1})$
preserves all predicates in $\Gamma_1$. Thus, there is an execution (differing
from $\beta_1 (s_n, l_n, s_{n+1})$ only in the labeling) of $(C_1')$ in context
$[] \parallel pre^{\omega}\,\Gamma_1$ that $(C_1)$ in the same context cannot exhibit, that
is, there is an execution that is in $(C_1') \parallel pre^{\omega}\,\Gamma_1$ but not in
$(C_1) \parallel pre^{\omega}\,\Gamma_1$. This, however, contradicts

$C_1 \geq_{E_1} C_1' \pmod{V_1}$

where $E_1 = \{P_1\} ; [[] \parallel pre^{\omega}\,\Gamma_1]$ and thus the first premise

$[P_1, \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q_1, \Delta_1]$.

Subcase: $(s_n, l_n, s_{n+1})$ was contributed by $C_2$. In this case, we get a
contradiction with

$[P_2, \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q_2, \Delta_2]$

for analogous reasons.

Case: There is no longest prefix of $\beta$ that can be extended to an execution
of $E[(C_1 \parallel C_2)]$. Consequently, $\beta$ is infinite. Moreover, there
are infinitely many program transitions by $(C_1' \parallel C_2')$ along $\beta$. We
distinguish two cases.

Subcase: There are infinitely many program transitions by $C_1'$ along
$\beta$. Since all transitions by $C_2$ preserve $\Delta_2$ and thus also $\Gamma_1$,
there is an execution (differing from $\beta$ only in the labeling) of
$(C_1')$ in context $[[] \parallel pre^{\omega}\,\Gamma_1]$ that $(C_1)$ in the same context
cannot exhibit. This, however, contradicts the first premise

$[P_1, \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q_1, \Delta_1]$.

Subcase: There are infinitely many program transitions by $C_2'$ along
$\beta$. In this case we get a contradiction with the second premise

$[P_2, \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q_2, \Delta_2]$.

This concludes the proof of (A.5). Thus, every transition by $C_1' \parallel C_2'$ in
context $E$ can be matched by $C_1 \parallel C_2$ modulo $V_1 \cup V_2$, that is,

^[(C'i||C2)]  3^  E[{C[)\\{C!,)]  {mod  V1UV2). 

By  Lemma  A.l  we  get 

^[(C'i||C2)]  EliCiWiC!^)]  {mod  V1UV2) 

which  is  equivalent  to 

Cl  II  C2  Ci  II  {mod  V1UV2). 

A.2.2 Derived rules

Rule  COND 

The premise $P \Rightarrow (B \Leftrightarrow B')$ implies

$[P, \{P, B'\}]\ \{B\} \succ \{B'\}\ [P \land B', Preds(Var)]$

and

$[P, \{P, \neg B'\}]\ \{\neg B\} \succ \{\neg B'\}\ [P \land \neg B', Preds(Var)]$.

Using the first two premises and rule SEQ we get

$[P, \{P, B'\} \cup \Gamma_1]\ \{B\} ; C_1 \succ_V \{B'\} ; C_1'\ [Q, \Delta_1]$

and

$[P, \{P, \neg B'\} \cup \Gamma_2]\ \{\neg B\} ; C_2 \succ_V \{\neg B'\} ; C_2'\ [Q, \Delta_2]$.

The desired result follows with OR and the definition of COND.

Rule  WHILE 

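The premise $I \Rightarrow (B \Leftrightarrow B')$ implies

$[I, \{I, B'\}]\ \{B\} \succ \{B'\}\ [I \land B', Preds(Var)]$

and

$[I, \{I, \neg B'\}]\ \{\neg B\} \succ \{\neg B'\}\ [I \land \neg B', Preds(Var)]$.

Using the first premise and the rules SEQ and STAR we get

$[I, \{I, B', \neg B'\} \cup \Gamma]$

$(\{B\} ; C)^* ; \{\neg B\} \succ_V (\{B'\} ; C')^* ; \{\neg B'\}$

$[I \land \neg B', \Delta]$.

Moreover, by SEQ and OMEGA we obtain

$[I, \{I, B'\} \cup \Gamma]\ (\{B\} ; C)^{\omega} \succ_V (\{B'\} ; C')^{\omega}\ [I \land \neg B', \Delta]$.

The desired result follows with OR and the definition of while.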

Rule  FOR 

Using  each  of  the  n  premises  and  rule  SEQ  we  get 

[P[i/i],Ur.] 

C[l/i] ; . . . ;  C[n/i]  yy  C"[l/i] ; . . . ;  C'[n/i] 

[<9K*1.n.A<]. 

The desired result follows by definition of for.

Rules PAR-V, PAR-N, and PAR-V-N

Since  PAR-V  is  a  special  case  of  PAR,  the  soundness  is  a  straightforward  corollary. 
Soundness  of  PAR-N  and  PAR-V-N  is  shown  inductively. 

A.2.3 Introduction rules

Rule  PAR-INTRO 

Robustness  of  C  implies  by  definition  that  C*  ||”C'.  Since  C  also  pre¬ 

serves  A;,  the  finite  loop  C*  does,  too,  that  is,  C*  C-j-t  pre“(pl,- A,)-  Using 
Lemma  3.4,  we  get 

[U,Predsi^)]  C*  WUCyWUCi 

The  premises  2,  3  and  4  of  the  rule  imply 

WUCyWUCi  par-n 

Thus,  with  Lemma  5.4, 

[p,Uri]  c*y\\uci  [AiQi.n,A,]. 

Finally,  the  desired  result  follows,  because 

[p,r]  CyC  [Q,A] 

if  and  only  if 

[P,T]  C;{Q}yC’  [Q,A]. 


Rule  WHILE-INTRO 

If  m  is  always  eventually  decremented,  then  B  cannot  remain  true  forever  and 
will  eventually  be  falsified.  That  is, 

({S}  ;  {inv*  ;  Am  ;  mi;*)"'’)" 

has  no  executions.  Formally, 

rn7] 

iff]  ~  ({-B}  ;  [inv* ;  Am  ;  jnu*m)+)" 

[tt,  Preds{  Tar)] . 

Using  the  third  premise  and  weakening  we  get 
iZi  = 

{ff}r.{{B};Cr 

$[tt, Preds(Var)]$.

Using the premise

$\mathcal{R}_2 = [B \land I, \Gamma]\ C \succ C'\ [I, \Delta]$

and  rule  OR  we  obtain 

[/,rur„] 

({/A5};C')*;{Q} 

({/A5};C7r;{Q}V{#} 

({5};Cr  ;{-5}V({5};C'r 

while  B  do  C 
[Q,A]. 


ORCri.Wj) 


Rule  FOR-INTRO 

Using  the  premise  and  FOR  we  get 

[/[o/i],U,r.] 

for  z  =  1  to  71  in  C  for  i  =  1  to  n  in  C* 

[/[n/i],n,A.] 

which  implies  the  result  by 

C*  Dj-t  for  i  =  1  to  n  in  C 

and  weakening. 

Rule NEW-INTRO

This rule is a straightforward consequence of NEW, Lemma 2.1, and Lemma 5.4.

Rule AWAIT-INTRO

If $m$ is either always eventually decremented forever, or at least until
it is 0, then $\neg B$ cannot be true forever. That is, $\{\neg B\}^{\omega}$ in parallel with

(znt;* ;  ^4^)^  V  {inv*  ;  Am  ;  ;  {tti  =  0}  ;  inv* m 

has  no  executions.  Formally, 

m 

{-ijB}"  II  [inv*  ;  AmY  V  (int)*  ;  Am  ;  inv*m)*  ;  {m  =  0}  ;  inv*m 
[m,  Preds{  Var)\ . 

Using the second premise and weakening we get
7^1  = 

m  ~  hBr  II D 

[tt,  Preds(  Var)] . 

Using  the  premise 

[Pi.r] 

V:[BAP2,Q2]  \\D 

[Qi,A] 

and  rule  OR  we  obtain 

[■Pi)  r  U  Fm] 

V:[BAP2,Q2]  \\D 

>-  =Tt 

[V:[BAP2,Q2]\\D]w{ff} 

>-  0R(Wi,'R2) 

[V:[BAP2,Q2]\\D]\/[{^Br  ||  D] 

(Lemma  2.1) 

[U:[B  A  P2,Q2]V  {-£}“]  ||  £> 

await  B  then  U:[P2)  Q2]  end  ||  D 
[Qi,A]. 

A.2.4 Proof of Lemma 5.6 on page 80

The proof proceeds by structural induction over the derivation of

$[P, \Gamma]\ C \succ_V C'\ [Q, \Delta]$.

Let the above refinement be obtained by a derivation that ends with the rule

ATOM:  In  this  case,  C  and  C'  are  atomic  statements.  Due  to  the  premises  of 
the  rule  we  must  have 

[p,r]  c  [it,  A] 

and 

[P,r]  C'  [Q,A]. 

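Since both assumption-commitment formulas were derived using ASS-COM,
we must have $\{P, Q\} \subseteq \Gamma$. Thus, let $wp(\mathcal{R}) = \{P\}$ and $sp(\mathcal{R}) = \{Q\}$.
The result follows.

SEQ: In this case, $C$ and $C'$ are of the form $C = C_1 ; C_2$ and $C' = C_1' ; C_2'$
respectively and there must be sets of predicates $\Gamma_1$, $\Gamma_2$, $\Delta_1$ and $\Delta_2$ and
sets of variables $V_1$ and $V_2$ such that $\Gamma = \Gamma_1 \cup \Gamma_2$, $\Delta = \Delta_1 \cap \Delta_2$, and
$V = V_1 \cup V_2$. Moreover, there must be a predicate $Q_1$ such that

$\mathcal{R}_1 = [P, \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q_1, \Delta_1]$

$\mathcal{R}_2 = [Q_1, \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q, \Delta_2]$.

By induction hypothesis, there are $wp(\mathcal{R}_1) \subseteq \Gamma_1$ and $sp(\mathcal{R}_1) \subseteq \Gamma_1$ such
that $P \Rightarrow \bigwedge wp(\mathcal{R}_1)$ and $\bigwedge sp(\mathcal{R}_1) \Rightarrow Q_1$ and

$[\bigwedge wp(\mathcal{R}_1), \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [\bigwedge sp(\mathcal{R}_1), \Delta_1]$.

Moreover, there also are $wp(\mathcal{R}_2)$ and $sp(\mathcal{R}_2)$ such that $wp(\mathcal{R}_2) \subseteq \Gamma_2$ and
$sp(\mathcal{R}_2) \subseteq \Gamma_2$ such that $Q_1 \Rightarrow \bigwedge wp(\mathcal{R}_2)$ and $\bigwedge sp(\mathcal{R}_2) \Rightarrow Q$ and

$[\bigwedge wp(\mathcal{R}_2), \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [\bigwedge sp(\mathcal{R}_2), \Delta_2]$.

Thus,

$[\bigwedge sp(\mathcal{R}_1), \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [\bigwedge sp(\mathcal{R}_2), \Delta_2]$.

Let $wp(\mathcal{R}) = wp(\mathcal{R}_1)$ and $sp(\mathcal{R}) = sp(\mathcal{R}_2)$. Using SEQ we have

$[\bigwedge wp(\mathcal{R}_1), \Gamma_1 \cup \Gamma_2]\ C_1 ; C_2 \succ_{V_1 \cup V_2} C_1' ; C_2'\ [\bigwedge sp(\mathcal{R}_2), \Delta_1 \cap \Delta_2]$

which implies the desired result.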

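PAR: In this case, $C$ and $C'$ are of the form $C = C_1 \parallel C_2$ and $C' = C_1' \parallel C_2'$
respectively, and $P$ and $Q$ are of the form $P = P_1 \land P_2$ and $Q = Q_1 \land Q_2$
respectively. Moreover, there must be sets of predicates $\Gamma_1$, $\Gamma_2$, $\Delta_1$ and
$\Delta_2$ and sets of variables $V_1$ and $V_2$ such that $\Gamma = \Gamma_1 \cup \Gamma_2$, $\Delta = \Delta_1 \cap \Delta_2$,
and $V = V_1 \cup V_2$ and

$\mathcal{R}_1 = [P_1, \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [Q_1, \Delta_1]$

$\mathcal{R}_2 = [P_2, \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [Q_2, \Delta_2]$.

By induction hypothesis, there are $wp(\mathcal{R}_1) \subseteq \Gamma_1$ and $sp(\mathcal{R}_1) \subseteq \Gamma_1$ such
that $P_1 \Rightarrow \bigwedge wp(\mathcal{R}_1)$ and $\bigwedge sp(\mathcal{R}_1) \Rightarrow Q_1$ and

$[\bigwedge wp(\mathcal{R}_1), \Gamma_1]\ C_1 \succ_{V_1} C_1'\ [\bigwedge sp(\mathcal{R}_1), \Delta_1]$.

Moreover, there also are $wp(\mathcal{R}_2)$ and $sp(\mathcal{R}_2)$ such that $wp(\mathcal{R}_2) \subseteq \Gamma_2$ and
$sp(\mathcal{R}_2) \subseteq \Gamma_2$ such that $P_2 \Rightarrow \bigwedge wp(\mathcal{R}_2)$ and $\bigwedge sp(\mathcal{R}_2) \Rightarrow Q_2$ and

$[\bigwedge wp(\mathcal{R}_2), \Gamma_2]\ C_2 \succ_{V_2} C_2'\ [\bigwedge sp(\mathcal{R}_2), \Delta_2]$.

Thus, by rule PAR

$[\bigwedge wp(\mathcal{R}_1) \land \bigwedge wp(\mathcal{R}_2), \Gamma_1 \cup \Gamma_2]$

$C_1 \parallel C_2 \succ_{V_1 \cup V_2} C_1' \parallel C_2'$

$[\bigwedge sp(\mathcal{R}_1) \land \bigwedge sp(\mathcal{R}_2), \Delta_1 \cap \Delta_2]$.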




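Let $wp(\mathcal{R}) = wp(\mathcal{R}_1) \cup wp(\mathcal{R}_2)$ and $sp(\mathcal{R}) = sp(\mathcal{R}_1) \cup sp(\mathcal{R}_2)$. Then, the
above refinement implies the desired result, because

$\bigwedge wp(\mathcal{R}) = \bigwedge wp(\mathcal{R}_1) \land \bigwedge wp(\mathcal{R}_2)$

$\bigwedge sp(\mathcal{R}) = \bigwedge sp(\mathcal{R}_1) \land \bigwedge sp(\mathcal{R}_2)$

$wp(\mathcal{R}) \subseteq \Gamma_1 \cup \Gamma_2$

$sp(\mathcal{R}) \subseteq \Gamma_1 \cup \Gamma_2$

$P_1 \land P_2 \Rightarrow \bigwedge wp(\mathcal{R})$

$\bigwedge sp(\mathcal{R}) \Rightarrow Q_1 \land Q_2$.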

NEW: In this case, $C$ and $C'$ are of the form $C =$ new $x = v$ in $C_1$ and
$C' =$ new $x = v$ in $C_1'$ respectively. Predicates $P$ and $Q$ are of the form
$P = P'[v/x]$ and $Q = \exists x.Q'$ respectively where $x \notin V$. Moreover, there
must be $\Gamma'$ and $\Delta'$ such that $\Gamma$ and $\Delta$ arise from $\Gamma'$ and $\Delta'$ respectively
by replacing every free occurrence of $x$ by all values in $Dom_x$ and

-  [p',r']  CiyvC[  [Q',A']. 

By  induction  hypothesis,  there  exist  wp(lV)  and  spijV)  such  that  wp{lV)  C 
r'  and  sp{lV)  C  P'  and  P'  =>  ^ 

[A«^p(7e'),r']  Ci>-vC[  [Asp(^').a']. 

By  rule  NEW 

[A«^p('7^')[^'A].r] 

new  X  =  V  in  C\  new  x  =  v  in  C[ 

[3x./\sp(n'),A]. 

Let  sp[1l)  =  {3x.Q  I  Q  G  sp{1Z')}.  Since  3x./\sp{ll')  implies  /\sp{1l), 
we  get  by  weakening 

[Awp('^')[^’/*].r] 

new  X  =  V  in  Cl  yy  new  x  =  v  in  Ci 

[/\sp{n),A]. 

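Case: $\neg(\bigwedge wp(\mathcal{R}'))[v/x]$, that is, $\bigwedge wp(\mathcal{R}')$ is unsatisfiable. In this case,
let $wp(\mathcal{R}) = \{ff\}$.

Case: $(\bigwedge wp(\mathcal{R}'))$ is satisfiable. In this case, let

$wp(\mathcal{R}) = \{P[v/x] \mid P \in wp(\mathcal{R}')\}$.

In both cases, $\bigwedge wp(\mathcal{R})$ implies $(\bigwedge wp(\mathcal{R}'))[v/x]$. Thus, by weakening

$[\bigwedge wp(\mathcal{R}), \Gamma]$

new $x = v$ in $C_1$ $\succ_V$ new $x = v$ in $C_1'$

$[\bigwedge sp(\mathcal{R}), \Delta]$.

The above refinement implies the desired result because

$wp(\mathcal{R}) \subseteq \Gamma$

$sp(\mathcal{R}) \subseteq \Gamma$

$P'[v/x] \Rightarrow \bigwedge wp(\mathcal{R})$

$\bigwedge sp(\mathcal{R}) \Rightarrow \exists x.Q'$.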


A.3 Example: Prefix sum (Section 7.1)

A.3.1 Proof of refinement (7.1) on page 146

Throughout the execution of $C_4$, the prev mapping partitions the set of indices
$\{1, \ldots, n\}$ into several disjoint sequences where two indices $i$ and $j$ are in the
same sequence iff one can be reached from the other by following the prev
pointer. The proof of refinement (7.1) makes heavy use of this property. It
requires two nested inductions, one over the length of these sequences and
another over the number of sequences. We first need to fix some notation.

• Let $N = \{1, \ldots, n\}$ and let $X \subseteq N$.

• Let $f : N \to N \cup \{nil\}$ be a function. We call $f$ injective iff

$i \neq j \Rightarrow (f(i) \neq f(j) \lor f(i) = f(j) = nil)$

for all $i, j \in N$.

• Given a function $f : N \to N \cup \{nil\}$, we call $(i_l, \ldots, i_0)$ a sequence
under $f$ and denote it by $[k]$ iff

- all its elements are drawn from $N$, that is, $i_j \in N$ for all $0 \leq j \leq l$,

- $k$ is the first element, that is, $i_0 = k$,

- every element $i_j$ is the image under $f$ of its right neighbour $i_{j-1}$, that
is, $i_j = f(i_{j-1})$ for all $1 \leq j \leq l$,

- the last element is mapped to nil, that is, $f(i_l) = nil$.

• A sequence $[k]$ is non-trivial, if it has more than one element, that is, if
$length([k]) > 1$.

A  sequence  has  the  important  property  that  the  image  of  an  element  in  the 
sequence  either  is  nil  or  is  also  an  element  of  that  sequence.  We  will  say  that 
sequences  are  closed. 

• A set of indices $X \subseteq \{1, \ldots, n\}$ is closed under $f : N \to N \cup \{nil\}$ iff

$f(i) = nil \lor f(i) \in X$

for all $i \in X$.

• Given a set of indices $X$, we call $i \in X$ first in $X$ with respect to $f$ iff $i$ is
not the image of any element in $X$, that is,

$\neg \exists j \in X.\ f(j) = i$.

Let $X^-_f$ be the non-first elements of $X$ with respect to $f$, that is,

$X^-_f = \{i \in X \mid i \text{ not first in } X \text{ wrt } f\}$.

Again, the subscript $f$ may be dropped when it is safe to do so. Given
a sequence $[k]$ under $f$, the set $[k]^-$ of non-first elements of $[k]$ is defined
similarly.
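
The notions fixed above can be made concrete with a small executable sketch. It is only an illustration under our own encoding assumptions: $f$ is a Python dictionary, nil is `None`, and the helper names (`is_injective`, `sequence_from`, `is_closed`, `first_elements`, `non_first`) are hypothetical and do not occur in the thesis. A sequence is returned in the order $i_0, \ldots, i_l$, that is, starting at $k$.

```python
from itertools import combinations

NIL = None  # assumed encoding of nil


def is_injective(f):
    """f(i) != f(j) or f(i) = f(j) = nil, for all distinct i, j."""
    return all(f[i] != f[j] or (f[i] is NIL and f[j] is NIL)
               for i, j in combinations(f, 2))


def sequence_from(f, k):
    """Follow f from k until nil; return [i_0, ..., i_l], or None on a cycle."""
    seq = [k]
    while f[seq[-1]] is not NIL:
        if len(seq) > len(f):  # more steps than indices: f cycles
            return None
        seq.append(f[seq[-1]])
    return seq


def is_closed(f, X):
    """Every image of an element of X is nil or again in X."""
    return all(f[i] is NIL or f[i] in X for i in X)


def first_elements(f, X):
    """Elements of X that are not the image of any element of X."""
    images = {f[j] for j in X}
    return {i for i in X if i not in images}


def non_first(f, X):
    """The set written X^- in the text."""
    return set(X) - first_elements(f, X)


# Two chains: 3 -> 2 -> 1 -> nil and 5 -> 4 -> nil.
f = {1: NIL, 2: 1, 3: 2, 4: NIL, 5: 4}
print(sequence_from(f, 3), is_closed(f, {1, 2}), non_first(f, {1, 2, 3}))
```

In this example the index set splits into the two sequences starting at 3 and at 5, which is the kind of partition induced by prev.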

Lemma A.3 Let $f : N \to N \cup \{nil\}$ be injective. Let $X$ be closed under $f$ and
let $[k]$ be a sequence under $f$. Then, $X \setminus [k]$ is again closed under $f$.

Proof: By contradiction. Suppose $X \setminus [k]$ is not closed, that is, there exists
$i \in X \setminus [k]$ such that $f(i) \neq nil$ and $f(i) \notin X \setminus [k]$. Since $X$ is closed, $f(i) \in X$.
Thus, we must also have $f(i) \in [k]$, but $i \notin [k]$. By the definition of sequences,
there must be $j \in [k]$ such that $j \neq i$ and $f(j) = f(i)$. This, however, contradicts
the injectivity of $f$. ■
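
The lemma can be checked exhaustively for small $n$. The sketch below is such a check under our own assumptions: the dictionary encoding and all function names are hypothetical, and the flag `require_k_first` adds the side condition that no element of $X$ is mapped to $k$. The step "there must be $j \in [k]$ with $f(j) = f(i)$" appears to need this when $f(i) = k$, which holds for the maximal sequences into which prev partitions the indices; reading the lemma this way is our interpretation, not a statement from the thesis.

```python
from itertools import combinations, product

NIL = None  # assumed encoding of nil


def is_injective(f):
    return all(f[i] != f[j] or (f[i] is NIL and f[j] is NIL)
               for i, j in combinations(f, 2))


def sequence_from(f, k):
    seq = [k]
    while f[seq[-1]] is not NIL:
        if len(seq) > len(f):  # f cycles, so there is no sequence [k]
            return None
        seq.append(f[seq[-1]])
    return seq


def is_closed(f, X):
    return all(f[i] is NIL or f[i] in X for i in X)


def subsets(N):
    for mask in range(1 << len(N)):
        yield {x for bit, x in enumerate(N) if (mask >> bit) & 1}


def counterexamples(n, require_k_first):
    """All (f, k, X) with f injective, X closed, but X \\ [k] not closed."""
    N = list(range(1, n + 1))
    found = []
    for images in product([NIL] + N, repeat=n):
        f = dict(zip(N, images))
        if not is_injective(f):
            continue
        for k in N:
            seq = sequence_from(f, k)
            if seq is None:
                continue
            for X in subsets(N):
                if not is_closed(f, X):
                    continue
                if require_k_first and any(f[j] == k for j in X):
                    continue  # some element of X is mapped to k
                if not is_closed(f, X - set(seq)):
                    found.append((f, k, X))
    return found


print(len(counterexamples(4, require_k_first=True)))
print(len(counterexamples(3, require_k_first=False)) > 0)
```

With the side condition the search finds no counterexample for $n \leq 4$. Without it, the chain $3 \mapsto 2 \mapsto 1 \mapsto nil$ with $X = \{1, 2, 3\}$ and $k = 2$ leaves $X \setminus [2] = \{3\}$, which is not closed.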

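We start by showing how a single non-trivial sequence of processes $\parallel_{i \in [k]} D_i$
can be refined into $\parallel_{i \in [k]} D_i'$. Given predicates $P_j$ for each $j$, let $P_X$ and $P_{[k]}$
denote the obvious extensions of $P_j$ to an index set and sequence respectively,
that is,

$P_X = \forall i \in X.\ P_i$

and

$P_{[k]} = \forall i \in [k].\ P_i$.

Lemma A.4 Let $f : N \to N \cup \{nil\}$ and $g : N \to V$ be functions where $f$ is
injective and let $[k]$ be a non-trivial sequence under $f$. Then,

$[I_{[k]} \land P_{[k]} \land P'_{[k]^-}, \Gamma_{[k]}]\ \parallel_{i \in [k]} D_i \succ_{\{c\}} \parallel_{i \in [k]} D_i'\ [I_{[k]} \land Q_{[k]} \land Q'_{[k]^-}, \Delta_{[k]} \cup \{I\}]$

where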

li  =  (8)(0  =  [l,i] 

$P_i = (x[i] = g(i) \land prev[i] = f(i))$

$P'_i = (c[i] = \langle (f(i), g(i)) \rangle)$

$Q_i = (prev[i] = f^2(i) \land x[i] = g(i) \oplus g(f(i)))$

$Q'_i = (c[i] = \epsilon)$

$\Gamma_{[k]} = \{I_i, P_i, Q_i \mid i \in [k]\} \cup \{P'_i, Q'_i \mid i \in [k]^-\}$

$\Delta_{[k]} = Preds(\{prev[i], x[i] \mid i \notin [k] \lor f(i) = nil\}) \cup Preds(\{c[i] \mid i \notin [k]^-\}) \cup \{I\}$.

Proof: By induction over the length $l$ of $[k]$.

Base: $l = 2$. Thus, there exists $j$ such that $[k] = (j, k)$, $f(k) = j$, and
$f(j) = nil$. With $P_{[k]}$ this implies $prev[k] = j$ and $prev[j] = nil$. Using PAR
then it is straightforward to show

[/[fc]  A  f\k]  A  ,  r[k]] 

)^{c}  par 

[l[k]AQ{k]^Q[k]-,^[k]]- 

Step:  V  =  Thus,  there  exists  j  such  that  f{k)  =  prev[k]  =  j  and  [k]  = 

By  induction  hypothesis, 

[^[j]  AP[,]AP(^.j_,rL,]] 

[%]  A(5[j]A(5'^.j.,A[j]]. 

Also, 

[/fe  A  P/c  A  Pj)  F/f]  Dff  ^{c}  [/fe  A  A  Qj,  . 

Using  PAR,  we  get 

[/[,]  A  /fe  A  P[j]  A  Pfe  A  P^.j_  A  Pj,  r[j]  U  Tfc] 

Ilf’A 

11.^] A-  II  Dk  yic)  IlF^i?;-  II 
llf'A 

[/[j]  A  4  A  Qy]  AQkA  Q\j]-  A  Q'-,  A[j]  D  A*] 

which  implies  the  desired  result.  Note  that  ly]  A  4  =  I[k],  Pm-  A  Pj  =  P[fc]- , 
and  ?[,]  U  Tfc  =  r[fc].  Similarly  for  Py],  Qy],  ,  and  Ay].  ■ 

The following proposition generalizes the above lemma by showing under
what circumstances $\parallel_{i \in X} D_i$ can be refined into $\parallel_{i \in X} D_i'$.

Proposition A.1 If $f$ is injective and $X$ is closed under $f$, then

$[I_X \land P_X \land P'_{X^-}, \Gamma_X]\ \parallel_{i \in X} D_i \succ_{\{c\}} \parallel_{i \in X} D_i'\ [I_X \land Q_X \land Q'_{X^-}, \Delta_X]$

where $I_X$, $P_X$, $P'_{X^-}$, $Q_X$, $Q'_{X^-}$, $\Gamma_X$, and $\Delta_X$ are defined as in Lemma A.4.

Proof: By induction over the number $t$ of non-trivial sequences in $X$.

Base: $t = 0$. Thus, $f(i) = prev[i] = nil$ for all $i \in X$. Consequently, neither $D_i$
nor $D_i'$ change the initial state in any way. Using induction over the size
of $X$ it is straightforward to show

$[I_X \land P_X \land P'_{X^-}, \Gamma_X]\ \parallel_{i \in X} D_i \succ_{\{c\}} \parallel_{i \in X} D_i'\ [I_X \land Q_X \land Q'_{X^-}, \Delta_X]$.

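Step: $t \geq 1$. Thus, $X$ contains at least one non-trivial sequence $[k]$. Using
Lemma A.3, $X \setminus [k]$ is again closed under $f$. By induction hypothesis,

$[I_{X \setminus [k]} \land P_{X \setminus [k]} \land P'_{(X \setminus [k])^-}, \Gamma_{X \setminus [k]}]$

$\parallel_{i \in X \setminus [k]} D_i \succ_{\{c\}} \parallel_{i \in X \setminus [k]} D_i'$

$[I_{X \setminus [k]} \land Q_{X \setminus [k]} \land Q'_{(X \setminus [k])^-}, \Delta_{X \setminus [k]}]$.

Also, by Lemma A.4,

$[I_{[k]} \land P_{[k]} \land P'_{[k]^-}, \Gamma_{[k]}]\ \parallel_{i \in [k]} D_i \succ_{\{c\}} \parallel_{i \in [k]} D_i'\ [I_{[k]} \land Q_{[k]} \land Q'_{[k]^-}, \Delta_{[k]}]$.

Thus, with PAR,

$[I_{X \setminus [k]} \land I_{[k]} \land P_{X \setminus [k]} \land P_{[k]} \land P'_{(X \setminus [k])^-} \land P'_{[k]^-}, \Gamma_{X \setminus [k]} \cup \Gamma_{[k]}]$

$\parallel_{i \in X} D_i$

$=_{T^\dagger}$

$\parallel_{i \in X \setminus [k]} D_i \parallel \parallel_{i \in [k]} D_i$

$\succ_{\{c\}}$

$\parallel_{i \in X \setminus [k]} D_i' \parallel \parallel_{i \in [k]} D_i'$

$=_{T^\dagger}$

$\parallel_{i \in X} D_i'$

$[I_{X \setminus [k]} \land I_{[k]} \land Q_{X \setminus [k]} \land Q_{[k]} \land Q'_{(X \setminus [k])^-} \land Q'_{[k]^-}, \Delta_{X \setminus [k]} \cap \Delta_{[k]}]$

which implies the desired result. ■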


Finally, refinement (7.1) is obtained from Proposition A.1 by instantiating
$X$ with $\{1, \ldots, n\}$, strengthening the assumptions (injectivity of prev implies
$P_N$; moreover, $\Gamma_N \subseteq \Gamma$) and weakening the commitments (injectivity of $f$
implies injectivity of $f^2$. Thus, with $Q_N$ this implies that prev is injective upon
termination).
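
To make the role of the postcondition more tangible, the following sketch performs one synchronous pointer-jumping round on a snapshot of prev and x. It rests on several assumptions of ours: we read $Q_i$ as $prev[i] = f^2(i) \land x[i] = g(i) \oplus g(f(i))$, we leave $x[i]$ unchanged when $f(i) = nil$, and we replace the channel communication of the thesis by direct reads of the snapshot. The names `jump_round`, `prefix_combine`, and `is_injective` are hypothetical.

```python
import operator

NIL = None  # assumed encoding of nil


def is_injective(f):
    """No two indices share a non-nil image."""
    images = [v for v in f.values() if v is not NIL]
    return len(images) == len(set(images))


def jump_round(prev, x, combine):
    """One round: every index reads its predecessor's old value and pointer."""
    new_prev, new_x = {}, {}
    for i, p in prev.items():
        if p is NIL:
            # Assumption: indices without predecessor are left unchanged.
            new_prev[i], new_x[i] = NIL, x[i]
        else:
            new_prev[i] = prev[p]              # corresponds to f(f(i))
            new_x[i] = combine(x[i], x[p])     # corresponds to g(i) (+) g(f(i))
    return new_prev, new_x


def prefix_combine(prev, x, combine):
    """Repeat rounds until every pointer is nil (assumes prev has no cycle)."""
    while any(p is not NIL for p in prev.values()):
        prev, x = jump_round(prev, x, combine)
    return x


prev = {1: NIL, 2: 1, 3: 2, 4: 3}
x = {1: 1, 2: 2, 3: 3, 4: 4}
prev1, x1 = jump_round(prev, x, operator.add)
print(prev1, is_injective(prev1))
print(prefix_combine(prev, x, operator.add))
```

After one round the new prev mapping is $f^2$, which is again injective, and repeating the round until all pointers are nil yields the prefix sums of the chain.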


A.4 N-process mutual exclusion algorithms (Section 8)

A.4.1 Proof of Lemma 8.2 on page 167

Suppose the conditions of the lemma are satisfied. For a contradiction assume
that $C$ violates the eventual entry property. Thus, there is a context $E$ and a
$B$-synchronization statement $S$ such that $C = E[S]$ and $S \neq_E \{B\}$. Due to the
definition of the two $B$-synchronization statements await $B$ and while $\neg B$ do skip,
we have $\{B\} \subseteq_{T^\dagger} S$ and thus $\{B\} \leq_E S$. Consequently, $S \not\leq_E \{B\}$, that is,
there exists $\alpha$ that is an execution of $E[(S)]$ but not of $E[(\{B\})]$. Then, there
must be $\alpha_1$ and $\alpha_2$ such that $\alpha = \alpha_1 \alpha_2$ and $\alpha_2$ contains infinitely many program
transitions each of which is a stuttering step in a state satisfying $\neg B$. Let $\alpha_2$
be of the form $(s_1, l_1, s_2)(s_2, l_2, s_3) \ldots$. Formally, for all $i \geq 1$,

if $l_i = p$ then $s_i = s_{i+1}$ and $s_i \models \neg B$. (A.6)

Using  condition  3  of  Lemma  8.2  we  distinguish  two  cases. 

Case:  The  environment  keeps  on  reducing  m,  that  is,  the  environment  transi¬ 
tions,  taken  together,  are  in  In  this  case,  the  envi¬ 

ronment  must  eventually  set  m  to  0  due  to  condition  1  (m  <  0).  More 
formally,  there  must  be  an  environment  transition  (sj,e,Sj+i)  along  a2 
such  that  Sj^i  [=  m  =  0.  Since  there  are  infinitely  many  program  tran¬ 
sitions  along  a2  there  must  be  infinitely  many  program  transitions  after 
this  transition.  Let  {sk^PySk)  be  the  first.  Since  the  environment  tran¬ 
sitions  are  in  T^l{inv*m  and  by  definition  of  inv  m  and  all 

environment  transitions  also  always  preserve  m  =  0.  Consequently,  m  =  0 
in  Sk-  By  condition  2,  Sk  also  satisfies  B  which  contradicts  (A.6). 

Case:  The  environment  transitions,  taken  together,  are  in  ;  Om)*  ;  DJ 

for some $D$ which contains no execution along which $\neg B$ is true infinitely
often. Thus, $\neg B$ holds along $\alpha_2$ only finitely many times. This, however,
contradicts the assumption (A.6) that $\neg B$ holds infinitely often along $\alpha_2$.

A.4.2 Proof of assumption-commitment formula (8.2) in
the proof of Lemma 8.4 on page 173

To  establish  (8.2)  in  the  proof  of  Lemma  8.4  we  show  the  following  lemma. 
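Lemma A.5

1.

$[P \land B_i \land in[i] = l - 1, \{P\} \cup \Gamma_{\{i\}}]$

$in[i], last[in[i] + 1] := in[i] + 1, i$

$[P \land in[i] = l, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

2.

$[P, \{P\} \cup B_{\{i\}}]$ await $highest(i) \lor \neg last(i)$ $[P \land B_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

3.

$[P, \{P\}]\ cr_i\ [P, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

4.

$[P, \{P\} \cup \Gamma_{\{i\}}]\ in[i] := 0\ [P \land bot_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

5.

$[P \land bot_i, \{P\} \cup \Gamma_{\{i\}}]\ nc_i\ [P \land bot_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

□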

Using assumption-commitment formulas (1), (2) of this lemma, and rules SEQ
and FOR, we get

$[P \land bot_i, \{P\} \cup \Gamma_{\{i\}}]$

for $l := 1$ to $n - 1$ do
$in[i], last[in[i] + 1] := in[i] + 1, i$;
await $highest(i) \lor \neg last(i)$
od

$[P \land B_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$.

Note that the invariant for the for loop is

$P \land B_i \land in[i] = l - 1$.

By assumption-commitment formulas (3), (4), and (5), and rules SEQ and
WHILE, it is straightforward to conclude (8.2). The while loop has the
invariant $P \land B_i$.

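Proof of Lemma A.5:

1. We start with the most difficult statement.

$[P \land B_i \land in[i] = l - 1, \{P\} \cup \Gamma_{\{i\}}]$

$A_i$

$[P \land in[i] = l, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

for $1 \leq l \leq n - 1$ where $A_i = (in[i], last[in[i] + 1] := in[i] + 1, i)$. Using ATOM,
we need to show that

• $P \land B_i \land cf_{A_i} \Rightarrow P$, and

• $P \land in[i] = l - 1 \land cf_{A_i} \Rightarrow in[i] = l$, and

• $P \land B_i \land B \land cf_{A_i} \Rightarrow B$ for all $B \in \Gamma_{N \setminus \{i\}}$.

The proof is by cases. If $P$ is $\forall x.P'$ then let $P[i/x]$ stand for the predicate
$P'[i/x]$ which arises from $P'$ by replacing all free occurrences of $x$ by $i$.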

(a) Show $P \land B_i \land cf_{A_i} \Rightarrow P_2$. Let $(s, s')$ be such that $(s, s') \models cf_{A_i}$ and let
$s \models P \land B_i$. We need to show that $s' \models P_2[i/x]$ and $s' \models P_2[j/x]$ for
all $j \neq i$.

Case: Show $s' \models P_2[i/x]$. Clearly, $s' \models last[in[i]] = i$ which implies
$s' \models in[last[in[i]]] = in[i]$ and thus $P_2[i/x]$.

Case: Show $s' \models P_2[j/x]$ for all $j \neq i$.

Subcase: Process $i$ has just entered a level that process $j$ was already
on, that is, $in[i] = in[j]$ in $s'$. By $in[last[in[i]]] = in[i]$
this implies $in[last[in[j]]] = in[j]$ in state $s'$ which implies $P_2[j/x]$.

Subcase: Processes $i$ and $j$ are on different levels, that is,

$in[i] \neq in[j]$

in $s'$. Then, the values of $in[j]$, $last[in[j]]$, and $in[last[in[j]]]$
are unchanged. Since by assumption

$s \models in[last[in[j]]] = in[j]$,

we thus also have

$s' \models in[last[in[j]]] = in[j]$.

Consequently, $P_2[j/x]$.

(b) Show $P \land B_i \land cf_{A_i} \Rightarrow P_1[i/x]$. Let $(s, s')$ be such that $(s, s') \models cf_{A_i}$.

Case: $s \models P \land highest(i)$. Then, $highest(i)$ is preserved by $A_i$ and
thus $s' \models P_1[i/x]$.

Case: $s \models P \land \neg last(i)$. Thus, there exists $j$ with $j \neq i$ such that $j$
is last, that is, $last[in[i]] = j$ in $s$. Instantiating $P_2$ with $i$ gives
$in[last[in[i]]] = in[i]$. Thus, $i$ and $j$ are on the same level, that
is, $in[j] = in[i]$ in $s$. $P_1$ implies $P_1[j/x] = highest(j) \lor tower(j)$.
Since $i$ and $j$ are on the same level, $j$ cannot be highest and
thus $tower(j)$ in $s$. Since $i$ is one level above $j$ in $s'$, we have
$s' \models tower(i)$ which implies $s' \models P_1[i/x]$.

(c) Show $P \land B_i \land cf_{A_i} \Rightarrow P_1[j/x]$ for all $j \neq i$.

Case: $s \models P_1[j/x]$ because $s \models tower(j)$.

Subcase: $s \models in[i] > in[j]$. Then, $A_i$ preserves $tower(j)$ and so,
$s' \models P_1[j/x]$.

Subcase: $s \models in[i] < in[j]$. Then, $i$ cannot be the highest process
and thus either $in[i] = 0$ or $\neg last(i)$ due to $B_i$. If $in[i] = 0$ in
$s$, then $A_i$ preserves $tower(j)$ and thus, $s' \models P_1[j/x]$. Assume
$\neg last(i)$. Thus, $i$ shares the level it is on with another process
$k$ which stays on that level when $i$ moves on through the
execution of $A_i$. Thus, $tower(j)$ is preserved by $A_i$, that is,
$s' \models tower(j)$ which implies $s' \models P_1[j/x]$.

Case: $s \models P_1[j/x]$ because $s \models highest(j)$.

Subcase: $i$ was higher than $j$, that is, $s \models in[i] > in[j]$. This
case is impossible, because it contradicts $s \models highest(j)$.




Subcase: $i$ was exactly one level below $j$ and thus joins $j$ through
this transition, that is, $s \models in[i] + 1 = in[j]$. Thus, $i$ cannot
be highest in $s$ and therefore we must have $s \models \neg last(i)$.
Using $P_2$ and the same argument as in the previous case we
conclude that there exists $k \neq i$ such that $in[k] = in[i]$ and
$last[in[i]] = k$ in $s$. $P_1$ implies

$P_1[k/x] = tower(k) \lor highest(k)$.

Since $highest(j)$ by assumption, $k$ cannot be the highest process,
so we must have $tower(k)$ in $s$. Consequently, $tower(j)$ and
thus $P_1[j/x]$ after execution of $A_i$ in $s'$.

Subcase: $i$ was more than one level below $j$, that is,

$s \models in[i] + 1 < in[j]$.

Then, $highest(j)$ is preserved by $A_i$ and thus $P_1[j/x]$ in $s'$.

(d) Show $P \land in[i] = l - 1 \land cf_{A_i} \Rightarrow in[i] = l$. This follows directly with
ATOM.

(e) Show $P \land B_i \land B \land cf_{A_i} \Rightarrow B$ for all $B \in \Gamma_{N \setminus \{i\}}$. Since

$\Gamma_{N \setminus \{i\}} = B_{N \setminus \{i\}} \cup Preds(\{in[j] \mid j \neq i\})$

we distinguish two cases.

Case: $B \in B_{N \setminus \{i\}}$. We show the stronger statement

$(highest(j) \lor \neg last(j)) \land cf_{A_i} \Rightarrow (highest(j) \lor \neg last(j))$

for all $j \neq i$. Let $(s, s')$ be such that $(s, s') \models cf_{A_i}$.

Subcase: $s \models highest(j)$. Thus, $in[i] < in[j]$ in $s$. This means
there are two cases to consider. If $in[i] + 1 < in[j]$, then
$A_i$ preserves $highest(j)$. Otherwise, $i$ joins $j$ on its level and
becomes last, that is, since $i \neq j$, $\neg last(j)$ after execution of
$A_i$. Thus, $s' \models highest(j) \lor \neg last(j)$.

Subcase: $s \models \neg last(j)$. Due to $i \neq j$, $A_i$ will always preserve
$\neg last(j)$.

Case: $B \in Preds(\{in[j] \mid j \neq i\})$. Since $B$ does not mention $in[i]$,
it is preserved by $A_i$. Formally, $B \land cf_{A_i} \Rightarrow B$ for all
$B \in Preds(\{in[j] \mid j \neq i\})$.
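Thus, $P \land B_i \land cf_{A_i} \Rightarrow P$ and $P \land B_i \land B \land cf_{A_i} \Rightarrow B$ for all $B \in \Gamma_{N \setminus \{i\}}$.

2. To prove

$[P, \{P\} \cup B_{\{i\}}]$ await $highest(i) \lor \neg last(i)$ $[P \land B_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

we use the definition of the await statement and show

$[P, \{P\} \cup B_{\{i\}}]$

$\{highest(i) \lor \neg last(i)\}$ (ATOM)

$[P \land B_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

and

$[P, \{P\} \cup B_{\{i\}}]$

$\{\neg highest(i) \land last(i)\}^{\omega}$ (ATOM, OMEGA)

$[P \land B_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$.

The desired result follows with the OR rule.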

3. The statement

$[P, \{P\}]\ cr_i\ [P, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

follows from the fact that $cr_i$ does not change in or last by assumption.
Thus, all predicates involving these variables only will always be preserved.
It can be shown formally using the rules that correspond to the structure
of $cr_i$.

4. Finally,

$[P, \{P\} \cup \Gamma_{\{i\}}]\ in[i] := 0\ [P \land bot_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

is proved with ATOM by showing

$P \land cf_{in[i] := 0} \Rightarrow P \land bot_i$

and

$P \land B \land cf_{in[i] := 0} \Rightarrow B$

for all $B \in \Gamma_{N \setminus \{i\}}$.

5. The statement

$[P \land bot_i, \{P\} \cup \Gamma_{\{i\}}]\ nc_i\ [P \land bot_i, \{P\} \cup \Gamma_{N \setminus \{i\}}]$

follows from the invariance of in and last. Thus, all predicates involving
these variables only will always be preserved. It can be shown formally
using the rules that correspond to the structure of $nc_i$.
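
Part (a) of the proof of statement 1 can be cross-checked by exhaustive search over small state spaces. The sketch below does this under readings that are our assumptions and are not confirmed by the text: processes are numbered from 0, levels range over $0, \ldots, n-1$ with last defined for levels $1, \ldots, n-1$, $highest(i)$ means that every other process is on a strictly lower level, $last(i)$ means $in[i] \geq 1$ and $last[in[i]] = i$, $B_i$ is $highest(i) \lor \neg last(i)$, and $P_2$ is $in[last[in[x]]] = in[x]$ for every process $x$ with $in[x] \geq 1$. All function names are hypothetical.

```python
from itertools import product


def highest(inn, i):
    return all(inn[j] < inn[i] for j in range(len(inn)) if j != i)


def is_last(inn, last, i):
    return inn[i] >= 1 and last[inn[i]] == i


def B(inn, last, i):
    return highest(inn, i) or not is_last(inn, last, i)


def P2(inn, last):
    return all(inn[last[inn[x]]] == inn[x]
               for x in range(len(inn)) if inn[x] >= 1)


def step(inn, last, i):
    """The atomic statement in[i], last[in[i]+1] := in[i]+1, i."""
    level = inn[i] + 1
    new_in = list(inn)
    new_in[i] = level
    new_last = dict(last)
    new_last[level] = i
    return tuple(new_in), new_last


def violations(n, require_B=True):
    """States satisfying the precondition whose successor violates P2."""
    levels = range(1, n)
    bad = []
    for inn in product(range(n), repeat=n):
        for owners in product(range(n), repeat=n - 1):
            last = dict(zip(levels, owners))
            if not P2(inn, last):
                continue
            for i in range(n):
                if inn[i] + 1 > n - 1:
                    continue  # process i is already on the top level
                if require_B and not B(inn, last, i):
                    continue
                new_in, new_last = step(inn, last, i)
                if not P2(new_in, new_last):
                    bad.append((inn, last, i))
    return bad


print(len(violations(3)), len(violations(3, require_B=False)))
```

Under these readings the search finds no violation for $n \leq 4$ when $B_i$ is assumed, whereas dropping $B_i$ produces violations. An example is $in = (1, 1, 0)$ with $last[1] = 0$ and process 0 moving up: process 1 stays on level 1, but the process recorded as last there has left. This suggests that the precondition $B_i$ is also needed in the subcase where $i$ and $j$ end up on different levels.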

A.4.3 Proof of Lemma 8.5 on page 174

We  show  the  above  lemma  by  transforming  TIE]^^  into  a  program 
with  identical  executions  where 

1. await $B_i$ is replaced by (await $B_i$)$\setminus \neg bot_i$,

2. $cr_i$ is replaced by $cr_i \setminus (top_i \land B_i)$, and

3. $nc_i$ is replaced by $nc_i \setminus bot_i$.

Let  Ci  be  the  ith  process  in  TIE]^^  and  let  entryi  be  the  entry  protocol  of  Ci. 
Then, by AWAIT, SEQ, and FOR we derive

$\mathcal{R}_1 = [bot_i, Preds(\{in[i]\}) \cup \{B_i\}]$

$entry_i \sim entry_i'$

$[top_i \land B_i, Preds(\{in[j] \mid j \neq i\}) \cup \{B_j \mid j \neq i\}]$

where $entry_i'$ is like $entry_i$ except that the occurrence of await $B_i$ is replaced
by (await $B_i$)$\setminus \neg bot_i$ where

(await $B_i$)$\setminus \neg bot_i$
$= (\{B_i\} \lor \{\neg B_i\}^{\omega}) \setminus \neg bot_i$
$= \{B_i \land \neg bot_i\} \lor \{\neg B_i \land \neg bot_i\}^{\omega}$.

Informally, assuming that the environment never changes the value of $in[i]$ and
preserves $B_i$ and that we have $bot_i$ initially, the entry protocol of $C_i$ will
terminate in a state in which process $i$ is on the highest level and $B_i$ holds. Also,
$\neg bot_i$ will hold during the execution of await $B_i$. Let $cr_i$ be the critical region
statement in $C_i$. Since $cr_i$ is well-formed and thus does not change in and last
we can show (using rules according to the structure of $cr_i$) that

$\mathcal{R}_2 = [top_i \land B_i, \{top_i, B_i\}]$

$cr_i \sim (cr_i \setminus top_i \land B_i)$

$[top_i \land B_i, \{bot_j, top_j, B_j \mid j \neq i\}]$.

Then, by composing the above two refinements sequentially

$\mathcal{R}_3 = [bot_i, Preds(\{in[i]\}) \cup \{B_i\}]$

$entry_i ; cr_i \sim entry_i' ; (cr_i \setminus top_i \land B_i)$ (SEQ($\mathcal{R}_1$, $\mathcal{R}_2$))

$[top_i \land B_i, Preds(\{in[j] \mid j \neq i\}) \cup \{B_j \mid j \neq i\}]$.

Similarly, using the well-formedness of $nc_i$ we can show

$\mathcal{R}_4 = [top_i, \{bot_i, top_i\}]$

$in[i] := 0 ; nc_i \sim in[i] := 0 ; (nc_i \setminus bot_i)$ (ATOM, SEQ)

$[bot_i, \{bot_j, top_j, B_j \mid j \neq i\}]$.

Let $C_i'$ be like $C_i$ except that await $B_i$ is replaced by (await $B_i$)$\setminus \neg bot_i$, and
$cr_i$ is replaced by $cr_i \setminus (top_i \land B_i)$, and $nc_i$ is replaced by $nc_i \setminus bot_i$, that is,

$C_i' =$ while tt do
$entry_i' \setminus \neg bot_i$;
$cr_i \setminus (top_i \land B_i)$;
$in[i] := 0$;
$nc_i \setminus bot_i$
od.
Then,  by  sequential  composition  and  the  WHILE  rule 
7^5,,'  =  [6of,,Prec/s({m[i]})U{S,}] 

Ci  ~  C'i  seq(je3.r0.  while 
[«,  Preds{{in\j]  \  j  7^  ?})  U  {Bj  |  j  ^  i}]- 
By  n-fold  parallel  composition,  we  get 

He  =  [Ar=i^^^»5  Pr€ds{{in[i]  |  1  <  i  <  n})  U  {Bi  |  1  <  «  <  «}] 

lir^iG  PAR-N(^...) 

[ttj  Preds(0)]. 

Finally,  rule  NEW  yields 

[U,Preds{id)]  ~  [tt,  Preds{9)] 

where 

TlE^at  at  =  m[1..2]  =  0,  last  =  0,  mid[\]  =  ff,  mid[2]  ■=  ff  \n 

WUC'i- 

By  Lemma  5.5,  this  implies  that  the  two  programs  have  the  same  traces,  that 
is, 

m 

A.4.4 Proof of Refinement Rule 8.2 on page 183

We will show the parallel case only. The remaining two sequential cases can be
proven analogously. More precisely, we show

$[tt, \{B_j \mid 1 \leq j \leq n\}]$

while $\exists j \neq i.\ \neg B_j$ do skip

$\succ$

$\parallel_{j=1,\, j \neq i}^{n}$ while $\neg B_j$ do skip

$[tt, Preds(Var)]$

by induction over $n$.

Base: $n = 2$. Then the statement specializes to

$[tt, \{B_j\}]$

while $\neg B_j$ do skip

$\succ$

while $\neg B_j$ do skip

$[tt, Preds(Var)]$

where $j \neq i$. This is easily seen to be true.

Step:  n  =  n'  +  1.  By  induction  hypothesis  and  PAR 
[tt,  {Bj  I  1  <  i  <  n'}] 

while  3j  ^  do  skip  ||  while  do  skip 

induction  hypothesis 

j^jwhile  -iBj  do  skip  ||  while  do  skip 

llJJij^jWhile  -^Bj  do  skip 
[tt,  Preds{  Var)] . 

It  remains  to  be  shown  that 

1  1  <i  <  n'  +  l}] 

while  3j  /  do  skip 

while  3j  i.-^Bj  do  skip  ||  while  ^Bn‘+\  do  skip 

\tt,  Preds{  Var)] . 

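Consider an execution $\alpha$ of

\[
\mathbf{while}\ \exists j \ne i.\, \neg B_j\ \mathbf{do}\ \mathbf{skip}
\;\|\; \mathbf{while}\ \neg B_{n'+1}\ \mathbf{do}\ \mathbf{skip}
\]

in a parallel environment that preserves $B_j$ for all $1 \le j \le n' + 1$. We
distinguish two cases:

Case: Both loops terminate. In this case, $\alpha$ also is an execution of

\[ \{\exists j \ne i.\, \neg B_j\}^{*} \,;\, \{\forall j \ne i.\, B_j\} \]

in that same environment.

Case: At least one of the loops does not terminate. In this case, $\alpha$ also is
an execution of

\[ \{\exists j \ne i.\, \neg B_j\}^{\omega} \]

in that same environment.

Thus, $\alpha$ is an execution of $\mathbf{while}\ \exists j \ne i.\, \neg B_j\ \mathbf{do}\ \mathbf{skip}$ in an environment
that always preserves $B_j$ for all $1 \le j \le n' + 1$. ■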
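
The case analysis above can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions that are not part of the thesis: time is discrete, every loop tests its guard at every step, and each flag $B_j$ is given as a list of Booleans (a hypothetical `Timeline`) that stays true once it becomes true, which models an environment that preserves $B_j$. The names `first_true`, `exit_combined` and `exit_parallel` are illustrative only.

```python
from typing import List, Optional

Timeline = List[bool]  # value of one flag B_j at steps 0, 1, 2, ...


def first_true(timeline: Timeline) -> Optional[int]:
    """First step at which the flag holds, or None if it never does."""
    for step, value in enumerate(timeline):
        if value:
            return step
    return None


def exit_combined(flags: List[Timeline], i: int) -> Optional[int]:
    """Exit step of `while exists j != i. not B_j do skip` (None: no exit)."""
    others = [flags[j] for j in range(len(flags)) if j != i]
    for step in range(len(flags[0])):
        if all(timeline[step] for timeline in others):
            return step
    return None


def exit_parallel(flags: List[Timeline], i: int) -> Optional[int]:
    """Step by which every loop `while not B_j do skip`, j != i, has exited."""
    exits = [first_true(flags[j]) for j in range(len(flags)) if j != i]
    if any(e is None for e in exits):
        return None  # at least one loop never terminates
    return max(exits, default=0)


# Hypothetical timelines for three flags; each flag stays true once set.
flags = [
    [False, False, True, True],
    [False, True, True, True],
    [False, False, False, True],
]

# A flag that never becomes true keeps both formulations waiting.
stuck = [[False, False], [False, True], [False, False]]
```

In this toy model the combined loop and the parallel composition of the single-flag loops leave the waiting phase at the same step, and neither terminates when some flag never becomes true. This mirrors the two cases of the proof but does not replace the trace-based argument.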
A.4.5 Proof of Refinement Rule 8.4 on page 187

One inclusion,

\[ E[x_1, x_2 := v_1, v_2] \;\subseteq_{\mathcal{T}^\dagger}\; E[x_1 := v_1 ; x_2 := v_2], \]

holds due to the mumbling closure condition.

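We have to show that

\[ E[x_1 := v_1 ; x_2 := v_2] \;\subseteq_{\mathcal{T}^\dagger}\; E[x_1, x_2 := v_1, v_2]. \]

If we prove

\[ E'[x_1 := v_1 ; x_2 := v_2] \;\subseteq_{\mathcal{T}^\dagger}\; E'[x_1, x_2 := v_1, v_2] \pmod{\{x_1, x_2\}} \qquad \text{(A.7)} \]

instead, the result follows with Lemmas A.1.1 and 2.2. We show (A.7) by
proving

\[
\mathcal{T}[\![E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]]\!]
\;\subseteq\;
\mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]
\pmod{\{x_1, x_2\}} \qquad \text{(A.8)}
\]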

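and then invoking Lemma A.1.3. Let $\sigma$ stand for the initialization of $x_1$ and
$x_2$, that is, $\sigma = (x_1 = v_{0,1}, x_2 = v_{0,2})$. A trace of $C$ that is not obtained
through stuttering or mumbling is called basic, that is, if $\alpha \in \mathcal{T}[\![C]\!]$ then $\alpha$ is
a basic trace of $C$. According to Definition 2.7 on page 21, to prove (A.8),
we have to show that for every basic trace $\alpha$ of $E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$ such
that $(\sigma)\alpha$ also is a basic trace of $E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$, there exists a trace $\beta$ in
$\mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]$ such that $(\sigma)\beta$ also is in $\mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]$
and $\alpha \backslash x_1 \backslash x_2 = \beta \backslash x_1 \backslash x_2$. Formally,

\[
\begin{array}{l}
\forall \alpha \in \mathcal{T}[\![E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]]\!]. \\
\quad (\sigma)\alpha \in \mathcal{T}[\![E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]]\!] \\
\quad \Rightarrow\; \exists \beta \in \mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]. \\
\qquad\quad (\sigma)\beta \in \mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!] \\
\qquad\quad \wedge\; \alpha \backslash x_1 \backslash x_2 = \beta \backslash x_1 \backslash x_2. \qquad \text{(A.9)}
\end{array}
\]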

To show (A.9) let $\alpha$ be a basic trace of $E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$ such that $(\sigma)\alpha$ also
is a basic trace of $E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$. Every trace of $\langle x_1 := v_1 ; x_2 := v_2 \rangle$ along
$\alpha$ corresponds to a subtrace

\[ (s_i, p, [s_i \mid x_1 = v_1])\, \rho_i\, (t_i, p, [t_i \mid x_2 = v_2]) \]

where $\rho_i$ is finite and possibly empty. Since the parallel environment does not
write to $x_1$, it is safe to assume that $x_1$ has value $v_1$ in state $t_i$, that is, $t_i(x_1) = v_1$.
If $\rho_i$ is the empty trace, we call $(s_i, p, [s_i \mid x_1 = v_1])\, \rho_i\, (t_i, p, [t_i \mid x_2 = v_2])$ an
uninterrupted occurrence of $x_1 := v_1 ; x_2 := v_2$. Otherwise, we call it an interrupted
occurrence of $x_1 := v_1 ; x_2 := v_2$. If all occurrences of $x_1 := v_1 ; x_2 := v_2$ along $\alpha$ are
uninterrupted, then $\alpha$ is called benign. Note that if $\alpha$ consists of environment
transitions only, it is vacuously benign. Given an interrupted occurrence of
$x_1 := v_1 ; x_2 := v_2$, let

\[ \rho_i\, (s_i, p, [s_i \mid x_1 = v_1])\, (t_i, p, [t_i \mid x_2 = v_2]) \]

be the corresponding uninterrupted occurrence. Let $\mathit{swap}(\alpha)$ be the trace that
is like $\alpha$ except that every interrupted occurrence of $x_1 := v_1 ; x_2 := v_2$ has been
replaced by its corresponding uninterrupted occurrence. Thus, $\mathit{swap}(\alpha)$ is benign.
Note that $\mathit{swap}(\alpha)$ still is a basic trace of $E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$. Moreover,
note that due to possible assignments to $x_2$ along $\rho_i$, $\rho_i$ cannot be moved after
the second assignment but must be moved before the first assignment.

The  following  lemma  contains  some  useful  properties  of  the  swap  operation 
and  benign  executions. 

Lemma A.6 Let $E'$ be as in Refinement Rule 8.4, $\alpha$ be a basic trace of
$E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$ and $\sigma = (x_1 = v_{0,1}, x_2 = v_{0,2})$. Let $\mathit{mumble}(\alpha)$ be a
trace that is the same as $\alpha$ except that all occurrences of

\[ (s_i, p, [s_i \mid x_1 = v_1])\, (t_i, p, [t_i \mid x_2 = v_2]) \]

along $\alpha$ are replaced by

\[ (s_i, p, [s_i \mid x_1 = v_1, x_2 = v_2]). \]

Thus, if $\alpha$ is benign, then $\mathit{mumble}(\alpha)$ creates a trace in which $x_1$ and $x_2$ are
always updated simultaneously.

1. If $C$ mentions $x_1$ only in stuttering steps $\{B\}$ and $x_2$ in stuttering steps
$\{B\}$ and constant assignments and $\alpha \in \mathcal{T}[\![C]\!]$ and $(\sigma)\alpha \in \mathcal{T}[\![C]\!]$ and

• if $[s \mid x_1 = v_1] \models B$, then $s \models B$

for all $s$, then $\mathit{swap}(\alpha) \in \mathcal{T}[\![C]\!]$ and $(\sigma)(\mathit{swap}(\alpha)) \in \mathcal{T}[\![C]\!]$.

This lemma expresses that under certain conditions the set of basic traces
of $C$ is closed under making a trace benign by moving the interfering
subtraces $\rho_i$ before the first assignment.

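2. (a) If $\alpha \in \mathcal{T}[\![E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]]\!]$ is benign, then

$\mathit{mumble}(\alpha) \in \mathcal{T}[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]$.

Intuitively, mumbling a benign trace of $E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]$ yields a
trace of $E'[\langle x_1, x_2 := v_1, v_2 \rangle]$.

(b) If $(\sigma)\alpha \in \mathcal{T}[\![E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]]\!]$ is benign, then

$(\sigma)(\mathit{mumble}(\alpha)) \in \mathcal{T}[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]$.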

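3. The effect of the mumbling operator can be “mimicked” by adding a stuttering
step at each place where mumbling takes place and by undoing
all changes to $x_1$ and $x_2$. That is, whenever $\mathit{mumble}(\alpha) \in \mathcal{T}[\![C]\!]$ and
$(\sigma)\mathit{mumble}(\alpha) \in \mathcal{T}[\![C]\!]$, then there exists $\beta \in \mathcal{T}^\dagger[\![C]\!]$ such that $(\sigma)\beta \in
\mathcal{T}^\dagger[\![C]\!]$ and

\[ \beta \backslash x_1 \backslash x_2 = \mathit{mumble}(\alpha) \backslash x_1 \backslash x_2. \]

□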


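The proof of (A.9) now proceeds as follows. Let $\alpha$ be a basic trace of

\[ E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]. \]

By 1) it follows that $\mathit{swap}(\alpha)$ and $(\sigma)(\mathit{swap}(\alpha))$ are benign, basic traces of

\[ E'[\langle x_1 := v_1 ; x_2 := v_2 \rangle]. \]

Using 2) we conclude that

\[ \mathit{mumble}(\mathit{swap}(\alpha)) \]

and

\[ (\sigma)(\mathit{mumble}(\mathit{swap}(\alpha))) \]

are basic traces of $E'[\langle x_1, x_2 := v_1, v_2 \rangle]$. By 3) we conclude that there exists $\beta$ in
$\mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]$ such that

• $(\sigma)\beta \in \mathcal{T}^\dagger[\![E'[\langle x_1, x_2 := v_1, v_2 \rangle]]\!]$ and

• $\beta \backslash x_1 \backslash x_2 = \mathit{mumble}(\mathit{swap}(\alpha)) \backslash x_1 \backslash x_2$.

This concludes the proof of (A.9) and thus of Rule 8.4. ■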
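
The swap and mumble operations used in this proof can be made concrete on a toy model. The sketch below rests on assumptions that are not in the thesis: a trace is a list of actions `(agent, updates)` rather than a list of transition triples, states are recovered by replaying the actions, every step other than the two assignments is labelled `"e"`, and hiding is modelled by projecting states and dropping repeated states. All names (`run`, `swap`, `mumble`, `visible`, `alpha`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

State = Dict[str, int]
Action = Tuple[str, Dict[str, int]]  # (agent, updates); agent is "p" or "e"


def run(init: State, trace: List[Action]) -> List[State]:
    """Replay the actions from `init` and return every state visited."""
    states = [dict(init)]
    for _agent, updates in trace:
        nxt = dict(states[-1])
        nxt.update(updates)
        states.append(nxt)
    return states


def swap(trace: List[Action], first: Action, second: Action) -> List[Action]:
    """Move the environment actions between `first` and `second` before `first`."""
    out: List[Action] = []
    i = 0
    while i < len(trace):
        if trace[i] == first:
            j = i + 1
            while j < len(trace) and trace[j][0] == "e":
                j += 1
            if j < len(trace) and trace[j] == second:
                out.extend(trace[i + 1:j])  # the interfering subtrace rho_i
                out.extend([first, second])
                i = j + 1
                continue
        out.append(trace[i])
        i += 1
    return out


def mumble(trace: List[Action], first: Action, second: Action) -> List[Action]:
    """Merge each adjacent pair `first`, `second` into one simultaneous update."""
    merged: Action = ("p", {**first[1], **second[1]})
    out: List[Action] = []
    i = 0
    while i < len(trace):
        if trace[i] == first and i + 1 < len(trace) and trace[i + 1] == second:
            out.append(merged)
            i += 2
        else:
            out.append(trace[i])
            i += 1
    return out


def visible(states: List[State], hidden: Set[str]) -> List[State]:
    """Project away the hidden variables and drop consecutive repeated states."""
    out: List[State] = []
    for s in states:
        view = {k: v for k, v in s.items() if k not in hidden}
        if not out or out[-1] != view:
            out.append(view)
    return out


init = {"x1": 0, "x2": 0, "y": 0}
first, second = ("p", {"x1": 1}), ("p", {"x2": 2})
# An interrupted occurrence: the environment writes y and x2 in between.
alpha = [first, ("e", {"y": 5}), ("e", {"x2": 9}), second, ("e", {"y": 6})]
beta = mumble(swap(alpha, first, second), first, second)
```

In this example `beta` updates `x1` and `x2` in a single step, ends in the same final state as `alpha`, and agrees with `alpha` on the visible variable `y`. Moving the environment's write to `x2` before the first assignment, rather than after the second, is what keeps the final value of `x2` equal to 2.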

A.4.6 Proof of Refinement Rule 8.5 on page 188

$E[x_1, x_2 := v_1, v_2] \;\subseteq_{\mathcal{T}^\dagger}\; E[x_1 := v_1 ; x_2 := v_2]$ holds due to the mumbling
closure condition and implies the result.

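Let the contexts $E$, $E'$ and $E''$ be as in the rule, that is,

\[ E = \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in}\ E' \]

where $E' = E'' \,\|\, C$ for some sequential context $E''$ and some program $C$.
Moreover, given a synchronization statement await $B$ in $C$, let $E'''$ be such
that $C = E'''[\mathbf{await}\ B]$. Thus,

\[
\begin{array}{l}
E[\langle x_1, x_2 := v_1, v_2 \rangle] \\
\quad = \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\qquad [E''[\langle x_1, x_2 := v_1, v_2 \rangle] \;\|\; E'''[\mathbf{await}\ B]].
\end{array}
\]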

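Due to the eventual entry property, await $B$ can be replaced by the stuttering
step $\{B\}$ without changing the executions (Lemma 8.1). That is,

\[
\begin{array}{l}
\mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1, x_2 := v_1, v_2 \rangle] \;\|\; E'''[\mathbf{await}\ B]] \\
=_{\mathcal{T}^\dagger}\ \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1, x_2 := v_1, v_2 \rangle] \;\|\; E'''[\{B\}]].
\end{array}
\]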
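With Rule 8.4 we get

\[
\begin{array}{l}
\mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1, x_2 := v_1, v_2 \rangle] \;\|\; E'''[\{B\}]] \\
=_{\mathcal{T}^\dagger}\ \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1 := v_1 ; x_2 := v_2 \rangle] \;\|\; E'''[\{B\}]].
\end{array}
\]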

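Finally, by using the eventual entry property and thus Lemma 8.1 again, we
obtain

\[
\begin{array}{l}
\mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1, x_2 := v_1, v_2 \rangle] \;\|\; E'''[\mathbf{await}\ B]] \\
=_{\mathcal{T}^\dagger}\ \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1, x_2 := v_1, v_2 \rangle] \;\|\; E'''[\{B\}]] \\
=_{\mathcal{T}^\dagger}\ \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1 := v_1 ; x_2 := v_2 \rangle] \;\|\; E'''[\{B\}]] \\
=_{\mathcal{T}^\dagger}\ \mathbf{new}\ x_1 = v_{0,1}, x_2 = v_{0,2}\ \mathbf{in} \\
\quad [E''[\langle x_1 := v_1 ; x_2 := v_2 \rangle] \;\|\; E'''[\mathbf{await}\ B]]
\end{array}
\]

as desired. ■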